Open nadabrovitxka opened 3 years ago
Hi, thank you for the nice words, they mean a lot!
Since I am currently on holiday, I don't have access to a desktop computer. Therefore "proper" debugging will have to wait until I am back in about two weeks.
However, maybe we can find a workaround in the meantime. The way fanc insulation is set up, it always calculates the insulation scores for the entire genome and only subsets to the specified region at a later stage. For your use case this is obviously less than ideal, and I will have to limit the calculation to the chromosome in question.
For now, my advice would be to run the command on the whole genome with
fanc insulation stem_wt_ontarget.hic@5kb stem_wt_ontarget.insulation -w 500000
to generate the .insulation file. I would also add a couple more window sizes, down to 50kb or so - they will all be stored in the same object and the added calculation time is not so bad compared to calculating them separately later.
Then extract the score subset with
fanc insulation stem_wt_ontarget.insulation sub.insulation -r chr3:34mb-36mb
Then do the conversion to bigwig with
fanc insulation sub.insulation sub_500kb.bw -o bw -w 500000
I hope this works - as I mentioned, I can't test the commands right now.
If the subsetting is still causing issues, maybe you can instead convert the whole genome file to BigWig and subset that with a different tool?
fanc insulation stem_wt_ontarget.insulation stem_wt_ontarget_500kb.bw -o bw -w 500000
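A minimal, untested sketch of that fallback as one script. The two fanc commands are the ones given above; the subsetting step assumes that UCSC's bigWigToBedGraph is installed and on your PATH, and the output name sub_500kb.bedGraph is only a placeholder. The small run helper is hypothetical: it prints each command and executes it only when RUN=1 is set, so you can check the commands first.

```shell
#!/bin/sh
set -eu

# File names follow the ones used in this thread; SUB is a placeholder name.
HIC="stem_wt_ontarget.hic@5kb"
INS="stem_wt_ontarget.insulation"
BW="stem_wt_ontarget_500kb.bw"
SUB="sub_500kb.bedGraph"

# Hypothetical helper: print the command, and execute it only when RUN=1.
run() {
  echo "+ $*"
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  fi
}

# 1. Whole-genome insulation scores (command from this thread).
run fanc insulation "$HIC" "$INS" -w 500000

# 2. Whole-genome BigWig for the 500kb window (command from this thread).
run fanc insulation "$INS" "$BW" -o bw -w 500000

# 3. Subset to the region of interest with an external tool.
#    Assumption: UCSC bigWigToBedGraph is available; it writes bedGraph, not BigWig.
run bigWigToBedGraph -chrom=chr3 -start=34000000 -end=36000000 "$BW" "$SUB"
```

Running the script as-is only prints the three commands; running it with RUN=1 in the environment executes them in order and stops at the first failure.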
So, I had tried the first option myself with an earlier fanc version, and it failed as soon as it couldn't find the normalization vector for the other chromosomes. The current version seems to get through that fine and produced the whole-genome insulation file. It then fails at
fanc insulation stem_wt_ontarget.insulation sub.insulation -r chr3:34mb-36mb
fanc insulation: error: Output file cannot be empty when choosing default output format!
BUT, creating a bigwig of the whole genome works well and I got my insulation scores and they do make perfect sense. Thanks a lot, enjoy your vacation.
Thanks for reporting back so quickly. I'm glad you got something usable in the end - I'll look into fixing the issue properly once I am back, so I'll keep this open until then!
Love this toolkit so much. Makes everything so convenient. Great job!
I am using fanc v0.9.20 and it works well for full matrices, but I am having slight trouble calculating insulation scores when the matrix covers only a small subset of the genome (Capture HiC). This command:
fanc insulation stem_wt_ontarget.hic@5kb stem_wt_ontarget.insulation -w 500000 -r chr3:34mb-36mb -o bigwig
on this .hic file does not really restrict the analysis to chr3. It creates this message for each chromosome (except chr3, of course). Even though these errors do not stop the script, the insulation score calculation eventually fails with the following message
which maybe comes from the fact that it is reading the whole of chromosome 3, which is mostly empty. Running it with chr3:34000000-35500000 gives me the same result. Any clue?