nservant / HiC-Pro

HiC-Pro: An optimized and flexible pipeline for Hi-C data processing

converting allvalidpairs to pairs #619

Open ratheraarif opened 5 months ago

ratheraarif commented 5 months ago

Is there a way to convert an allValidPairs file to a pairs file? I need pairs files for some downstream analysis, but I only have allValidPairs files.

Best Regards

nservant commented 5 months ago

Yes, that's pretty straightforward. Please look at https://github.com/nf-core/hic/blob/master/modules/local/hicpro/hicpro2pairs.nf

ratheraarif commented 5 months ago

Thank you

ratheraarif commented 5 months ago

This requires going through the Nextflow pipeline. Can I do it without Nextflow?

nservant commented 4 months ago

Of course, this is why I linked to the source code. The script section of the nf code contains the bash commands used to generate the pairs file, so you can run them manually ...
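For readers who want to try the manual route, here is a minimal sketch of the column reordering involved. It is not a verbatim copy of the linked nf-core script. It assumes the usual HiC-Pro allValidPairs layout (readID, chr1, pos1, strand1, chr2, pos2, strand2, ...) and the 4DN pairs column order (readID, chr1, pos1, chr2, pos2, strand1, strand2). The function name `validpairs_to_pairs` is hypothetical.

```shell
# Hypothetical helper: reorder allValidPairs columns into pairs order.
# Assumed input columns: 1 readID, 2 chr1, 3 pos1, 4 strand1,
#                        5 chr2,   6 pos2, 7 strand2, (further columns ignored)
validpairs_to_pairs() {
    # Header lines as described in the 4DN pairs format specification
    printf '## pairs format v1.0\n'
    printf '#columns: readID chr1 pos1 chr2 pos2 strand1 strand2\n'
    # Keep tab as both input and output separator, then reorder the fields
    awk 'BEGIN { FS = OFS = "\t" } { print $1, $2, $3, $5, $6, $4, $7 }' "$1"
}
```

The result would then typically be compressed and indexed, for example `validpairs_to_pairs sample.allValidPairs | bgzip -c > sample.pairs.gz` followed by `pairix -f sample.pairs.gz` (the file names here are placeholders).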

ratheraarif commented 4 months ago

Thank you

ratheraarif commented 4 months ago

I got the following error when I tried to convert validPairs to pairs:

[ti_index_core] the chromosome blocks not continuous at line 7, is the file sorted? [pos 59]

nservant commented 4 months ago

I guess you need to sort the pairs file by chromosome and coordinate first.
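As a rough illustration of that suggestion, the sketch below sorts an uncompressed pairs file by chr1, chr2, pos1, pos2 so that each chromosome-pair block is contiguous. It assumes the pairs column order readID, chr1, pos1, chr2, pos2, strand1, strand2 and that header lines start with `#`. The function name `sort_pairs` is hypothetical, and the C-locale chromosome ordering is just one reasonable choice.

```shell
# Hypothetical helper: sort a plain-text pairs file, keeping '#' headers first.
sort_pairs() {
    tab="$(printf '\t')"            # literal tab character, portable across shells
    grep '^#' "$1" || true          # pass header lines through unchanged (if any)
    # Body: chr1 (col 2), chr2 (col 4), then pos1 and pos2 numerically
    grep -v '^#' "$1" | LC_ALL=C sort -t "$tab" -k2,2 -k4,4 -k3,3n -k5,5n
}
```

Usage would look like `sort_pairs sample.pairs > sample.sorted.pairs`, with compression and indexing done afterwards.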

ratheraarif commented 4 months ago

Actually, I am new to this pipeline. I would appreciate some pointers on this.

ratheraarif commented 4 months ago

I used the following command to sort the allvalidpairs file:

sort -V -t $'\t' -k 2,2 -k 5,5 -k 3,3 -k 6,6 "filename.allValidPairs" > filename2.allValidPairs

and the hicpro2pairs script worked well.

Am I missing anything here?

nservant commented 4 months ago

Why do you have a $ in your command?

ratheraarif commented 4 months ago

Just to show that it is a tab-delimited file.

Do I need to change it?

However, I see that removing the $ does not affect the result. Please clarify if I am missing anything here.

Also, do I have to sort the validPairs file before generating a .hic or .cool file? I am a little fuzzy on that.
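On the `$` question: in bash, `$'\t'` is ANSI-C quoting and expands to a single tab character, whereas a plain `'\t'` is the two characters backslash and `t`. GNU sort usually rejects the latter as a multi-character separator, though behaviour may differ with other sort implementations. The sketch below shows the difference using `printf`, which also gives a portable way to obtain a tab in shells without `$'...'` support. The variable name `tab` is only illustrative.

```shell
# Portable equivalent of bash's $'\t': a variable holding one literal tab
tab="$(printf '\t')"

# One byte: a real tab character
printf '%s' "$tab" | wc -c

# Two bytes: a backslash followed by the letter t
printf '%s' '\t' | wc -c
```

With that variable, the separator can be passed as `sort -t "$tab" ...` regardless of which shell runs the command.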

nservant commented 4 months ago

Which version of pairix are you using? I had in mind that with the most recent version, the file doesn't need to be sorted. That's why I think we commented out the sort command in the Nextflow code. Anyway, I think it's worth trying both approaches (with and without sorting, using a recent pairix version).

ratheraarif commented 4 months ago

Program: pairix (PAIRs file IndeXer)
Version: 0.3.8

The output of pairs looks like this.

[image: screenshot of the pairs file output]

I think I am missing the 11th and 12th columns in the pairs file generated by hicpro2pairs.