joachimwolff closed this issue 5 years ago
Hi @joachimwolff,
Thanks for the feedback. I have released version 0.5.1, which should fix this problem. Please let me know if the pip install continues to fail.
Also, if you have created any cooler files using version 0.4.2, please be sure to update them with hic2cool update once you have the new version. See here for more info.
Best, Carl
Hi,
Thanks for the update and the warning regarding division and multiplication of correction factors. Do I understand correctly that previously you did CorrectedMatrix_i,j = originalMatrix_i,j * correctionFactor_i * correctionFactor_j, and that this changes to CorrectedMatrix_i,j = originalMatrix_i,j / correctionFactor_i * correctionFactor_j, so that the new correction factors are just 1/correctionFactor of the previous ones?
Do you provide any flag in the cool files that records which version of hic2cool created them? I use your API in my software to transform hic files to cool files, and I don't think my users will want to handle this issue on their own.
Best,
Joachim
@joachimwolff
Yes, you understand correctly. For more information you can see this issue, written by the creator of cooler.
To your second question, there is an attribute in the cooler file called generated-by that keeps track of the hic2cool version used to make the file. hic2cool update automatically uses it to determine what changes to make. Currently the only change is the inversion of weights from version 0.5.0 onwards. After updating, the attribute is modified so that further updates will not run again on the same file.
You can use cooler or h5py to get the generated-by info:
import cooler
cool = cooler.Cooler("<cooler file>")
# generated-by has the form 'hic2cool_' + version
cool.info['generated-by']
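To act on this attribute in your own code, a minimal sketch along the following lines might work. The helper names parse_hic2cool_version and needs_update are hypothetical (they are not part of hic2cool or cooler), and the sketch assumes the attribute has the plain 'hic2cool_' + version form with a dotted numeric version.

```python
def parse_hic2cool_version(generated_by):
    """Return the version as a tuple of ints, or None if it cannot be parsed.

    Hypothetical helper; assumes the form 'hic2cool_' + dotted numeric version.
    """
    prefix = "hic2cool_"
    if not isinstance(generated_by, str) or not generated_by.startswith(prefix):
        return None
    try:
        return tuple(int(part) for part in generated_by[len(prefix):].split("."))
    except ValueError:
        # Non-numeric or empty version string: treat as unknown.
        return None


def needs_update(generated_by):
    """True if the file was written by hic2cool older than 0.5.0 (inverted weights)."""
    version = parse_hic2cool_version(generated_by)
    return version is not None and version < (0, 5, 0)
```

With a Cooler object as above, the check would be needs_update(cool.info['generated-by']). Files not generated by hic2cool return False, so they are left alone.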
If you want to programmatically update files from older versions, you can use the hic2cool package like so:
from hic2cool import hic2cool_update
# updates the input cooler file in place; silent=True disables the command line confirmation
hic2cool_update("<cooler file>", silent=True)
# OR, leave the input file unchanged and write to a new one
hic2cool_update("<cooler file>", "<target cooler file>", silent=True)
You can also provide the silent argument from the command line: hic2cool_update <file> --silent
Best, Carl
Hi Carl,
Thanks for your useful reply. I have one more question:
Is it CorrectedMatrix_i,j = originalMatrix_i,j / correctionFactor_i * correctionFactor_j, as I wrote, or CorrectedMatrix_i,j = originalMatrix_i,j / correctionFactor_i / correctionFactor_j, as it is written in the comment from @nvictus?
Thanks a lot.
Joachim
@joachimwolff @nvictus
In general,
Hic balancing: CorrectedMatrix_i,j = count_i,j / (hic_weight_i * hic_weight_j)
Cooler balancing: CorrectedMatrix_i,j = count_i,j * cooler_weight_i * cooler_weight_j
Cooler still uses multiplicative balancing. We just no longer invert the hic weights.
hic2cool < 0.5.0: weight = 1/hic_weight, and therefore CorrectedMatrix_i,j = count_i,j * weight_i * weight_j
hic2cool >= 0.5.0: weight = hic_weight, and therefore CorrectedMatrix_i,j = count_i,j / (weight_i * weight_j)
As of hic2cool 0.5.0, the weights from hic are preserved as divisive weights. cooler ICE still uses multiplicative weights. This change was implemented to reach a standard among multiple tools, some of which inverted juicer weights and some of which did not. Downstream tools are now expected to divide by weights that are called KR/VC/VC_SQRT. This was already the case for HiGlass. However, since cooler doesn't yet explicitly handle these exceptions, running cooler.matrix(balance= ... ) with a hic weight will no longer give the "correct" result, because cooler multiplies by the weights rather than dividing by them.
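To make the two conventions concrete, here is a small worked example in plain Python. The counts and weights are made-up numbers, and the function names are only illustrative. It shows that the old inverted weights applied multiplicatively give the same matrix as the preserved hic weights applied divisively, and that multiplying by the preserved weights gives a different result.

```python
import math

# Made-up 2x2 raw counts and per-bin hic weights, for illustration only.
counts = [[10.0, 4.0], [4.0, 8.0]]
hic_weight = [2.0, 0.5]


def balance_divisive(counts, weights):
    """CorrectedMatrix_i,j = count_i,j / (weight_i * weight_j)"""
    n = len(weights)
    return [[counts[i][j] / (weights[i] * weights[j]) for j in range(n)]
            for i in range(n)]


def balance_multiplicative(counts, weights):
    """CorrectedMatrix_i,j = count_i,j * weight_i * weight_j"""
    n = len(weights)
    return [[counts[i][j] * weights[i] * weights[j] for j in range(n)]
            for i in range(n)]


# hic2cool < 0.5.0 stored inverted weights, used multiplicatively.
old_weights = [1.0 / w for w in hic_weight]
old_result = balance_multiplicative(counts, old_weights)

# hic2cool >= 0.5.0 stores the hic weights as they are, to be used divisively.
new_weights = hic_weight
new_result = balance_divisive(counts, new_weights)

# Multiplying by the new (divisive) weights is the mistake to avoid.
wrong_result = balance_multiplicative(counts, new_weights)

same = all(math.isclose(old_result[i][j], new_result[i][j])
           for i in range(2) for j in range(2))
```

Here same is True, while wrong_result differs from new_result.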
Best, Carl
Thanks a lot for the clarification.
We have always stored the data the divisive way in HiCExplorer. Anyway, this is an unpleasant situation for our users if they don't know which type of correction factors is stored in their cooler files.
How about attaching some metadata to the attrs of the weight vectors? We already store some useful metadata when using cooler balance (you can take a look using the cooler attrs command).
It could be as simple as divisive: True. Taken to be False if missing (with the exception of the 3 special cases above).
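A reader of such files could then apply the rule roughly as sketched below. This only illustrates the proposal in this thread: the divisive attribute is not an existing cooler feature, the function name is_divisive is hypothetical, and the attrs are modeled as a plain dict rather than read from a file.

```python
# Weight names that are treated as divisive even without the proposed flag.
HIC_WEIGHT_NAMES = {"KR", "VC", "VC_SQRT"}


def is_divisive(weight_name, attrs):
    """Decide whether a weight column should be divided rather than multiplied.

    weight_name: name of the weight column, e.g. 'weight' or 'KR'.
    attrs: dict of metadata attached to that column (hypothetical layout).
    """
    if "divisive" in attrs:
        # An explicit flag takes precedence over the name-based fallback.
        return bool(attrs["divisive"])
    # No flag: False, except for the three special hic weight names.
    return weight_name in HIC_WEIGHT_NAMES
```

For example, is_divisive("KR", {}) is True, is_divisive("weight", {}) is False, and an explicit {"divisive": True} overrides the default for any name.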
Hi,
installing hic2cool version 0.5 with pip fails with:
However, version 0.4.2 works.
Best,
Joachim