nservant / HiC-Pro

HiC-Pro: An optimized and flexible pipeline for Hi-C data processing

digest_genome.py -r doesn't work with pre-defined values #435

Closed: sfpacman closed this issue 3 years ago

sfpacman commented 3 years ago

It should be cseq instead of cs: https://github.com/nservant/HiC-Pro/blob/9922f2369a47047b60a68bf900978ca59f70b3a1/bin/utils/digest_genome.py#L144 If you run the following command, you get the error shown below:

python HiC-Pro-3.0.0/bin/utils/digest_genome.py -r "dpnii" -o hg38_dpnii.bed GRCh38_no_alt_analysis_set_GCA_000001405.15.fasta

Analyzing GRCh38_no_alt_analysis_set_GCA_000001405.15.fasta
Restriction site(s) GATC
Offset(s) 0
Traceback (most recent call last):
  File "/some/fake/path/HiC-Pro-3.0.0/bin/utils/digest_genome.py", line 173, in <module>
    contig_names, all_indices = find_re_sites(filename, sequences,  offset=offset)
  File "/some/fake/path/HiC-Pro-3.0.0/bin/utils/digest_genome.py", line 29, in find_re_sites

nservant commented 3 years ago

Also fixed in the devel branch :( Sorry about that!

joreynajr commented 3 years ago

@sfpacman they changed the format of the -r argument in HiC-Pro v3.0.0 and I had to run it like this:

python HiC-Pro-3.0.0/bin/utils/digest_genome.py \
    -r "^GATC" \
    -o hg38_dpnii.bed \
    GRCh38_no_alt_analysis_set_GCA_000001405.15.fasta

Previously it ran the way you wrote it:

python HiC-Pro-3.0.0/bin/utils/digest_genome.py \
    -r "dpnii" \
    -o hg38_dpnii.bed \
    GRCh38_no_alt_analysis_set_GCA_000001405.15.fasta

I had to make this adjustment the other day when I switched from an older HiC-Pro. The update gives you much more flexibility to use or define new REs.
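
For anyone wondering how a caret-marked pattern such as "^GATC" relates to the "Restriction site(s) GATC" and "Offset(s) 0" lines printed above, here is a minimal sketch of the idea. It is not HiC-Pro's actual code: the KNOWN_ENZYMES table and the resolve_site and cut_positions helpers are hypothetical names made up for illustration, and I am assuming the caret simply marks the cut position inside the recognition sequence.

```python
import re

# Hypothetical lookup table for illustration only; not HiC-Pro's own table.
KNOWN_ENZYMES = {
    "dpnii": "^GATC",
    "mboi": "^GATC",
    "hindiii": "A^AGCTT",
    "bglii": "A^GATCT",
}


def resolve_site(arg):
    """Return (site, offset) for an enzyme name or a caret-marked pattern."""
    # Fall back to treating the argument as a pattern if it is not a known name.
    pattern = KNOWN_ENZYMES.get(arg.lower(), arg).upper()
    if pattern.count("^") != 1:
        raise ValueError(f"expected exactly one '^' in {pattern!r}")
    offset = pattern.index("^")  # cut position inside the recognition site
    return pattern.replace("^", ""), offset


def cut_positions(sequence, site, offset):
    """Return the 0-based cut positions of `site` within `sequence`."""
    # A zero-width lookahead lets overlapping occurrences be reported too.
    finder = re.compile(f"(?={re.escape(site)})")
    return [m.start() + offset for m in finder.finditer(sequence.upper())]


if __name__ == "__main__":
    site, offset = resolve_site("^GATC")
    print(site, offset)
    print(cut_positions("TTGATCAAGATC", site, offset))
```

With a table like this, a name ("dpnii") and an explicit pattern ("^GATC") resolve to the same site and offset, which is roughly the behavior the pre-defined values were meant to provide.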