deeptools / HiCExplorer

HiCExplorer is a powerful and easy to use set of tools to process, normalize and visualize Hi-C data.
https://hicexplorer.readthedocs.org
GNU General Public License v3.0

hicMergeDomains bug: `list index out of range` #790

Open cgirardot opened 2 years ago

cgirardot commented 2 years ago

Hi

Me again, sorry. I was unlucky again on my first try of hicMergeDomains (3.7.2):

Traceback (most recent call last):
  File "/g/funcgen/gbcs/public/software/conda/envs/hicexplorer-3.7.2/bin/hicMergeDomains", line 7, in <module>
    main()
  File "/g/funcgen/gbcs/public/software/conda/envs/hicexplorer-3.7.2/lib/python3.8/site-packages/hicexplorer/hicMergeDomains.py", line 428, in main
    create_tree(relationList, mergedListWithId, args.outputTreePlotPrefix, args.outputTreePlotFormat)
  File "/g/funcgen/gbcs/public/software/conda/envs/hicexplorer-3.7.2/lib/python3.8/site-packages/hicexplorer/hicMergeDomains.py", line 280, in create_tree
    while int(sList[posSList][1][0][3:]) < int(rList[posRList][1][3:]):
IndexError: list index out of range

thx

shenlinyong commented 2 years ago

Hello! I have the same problem. Did you manage to solve it?

LeilyR commented 2 years ago

Could you please provide the command line you used?

shenlinyong commented 2 years ago

Could you please provide the command line you used?

The software you have developed is fantastic! However, I run into a problem when I use hicMergeDomains to merge TAD files at multiple resolutions. How should I solve it?

hicexplorer=3.7.2 python=3.8.8
hicMergeDomains --domainFiles ../fat_5000_Normalize_domains.bed ../fat_10000_Normalize_domains.bed ../fat_25000_Normalize_domains.bed ../fat_50000_Normalize_domains.bed
Saved relation tree of NC_052532.1
Saved relation tree of NC_052533.1
Saved relation tree of NC_052534.1
Traceback (most recent call last):
  File "/home/SLY68/anaconda3/bin/hicMergeDomains", line 7, in <module>
    main()
  File "/home/SLY68/anaconda3/lib/python3.8/site-packages/hicexplorer/hicMergeDomains.py", line 428, in main
    create_tree(relationList, mergedListWithId, args.outputTreePlotPrefix, args.outputTreePlotFormat)
  File "/home/SLY68/anaconda3/lib/python3.8/site-packages/hicexplorer/hicMergeDomains.py", line 280, in create_tree
    while int(sList[posSList][1][0][3:]) < int(rList[posRList][1][3:]):
IndexError: list index out of range

Also, this is the error reported when I add the CTCF.Broad.Peaks.bed file using -p:

hicMergeDomains --domainFiles ../fat_5000_Normalize_domains.bed ../fat_10000_Normalize_domains.bed ../fat_25000_Normalize_domains.bed ../fat_50000_Normalize_domains.bed -p ./CTCF.Broad.Peaks.bed 
Traceback (most recent call last):
  File "/home/SLY68/anaconda3/bin/hicMergeDomains", line 7, in <module>
    main()
  File "/home/SLY68/anaconda3/lib/python3.8/site-packages/hicexplorer/hicMergeDomains.py", line 415, in main
    mergedList = create_list_with_protein(args.domainFiles[0], args.minimumNumberOfPeaks, proteinList)
  File "/home/SLY68/anaconda3/lib/python3.8/site-packages/hicexplorer/hicMergeDomains.py", line 399, in create_list_with_protein
    bList = compare_boundaries_protein(bList, cList)
  File "/home/SLY68/anaconda3/lib/python3.8/site-packages/hicexplorer/hicMergeDomains.py", line 374, in compare_boundaries_protein
    if posTad < len(bList) and chromPosition < len(cList) and bList[posTad][0] != cList[chromPosition][0][0]:
IndexError: list index out of range

Contents of the file CTCF.Broad.Peaks.bed

$ head CTCF.Broad.Peaks.bed
NC_052533.1 12768909    12768943    CTGF_1  -   -   37.4128 1.48e-15    3.07e-06
NC_052554.1 3560877 3560911 CTGF_2  -   +   36.5688 6.5e-15 5.06e-06
NC_052538.1 36329267    36329301    CTGF_3  -   -   36.367  9.07e-15    5.06e-06
NC_052536.1 26271359    26271393    CTGF_4  -   +   36.2569 1.08e-14    5.06e-06
NC_052549.1 11566318    11566352    CTGF_5  -   +   36.0367 1.54e-14    5.06e-06
NC_052540.1 23679796    23679830    CTGF_6  -   -   35.844  2.08e-14    5.06e-06
NC_052539.1 29527143    29527177    CTGF_7  -   -   35.844  2.08e-14    5.06e-06
NC_052545.1 47679   47713   CTGF_8  -   +   35.844  2.08e-14    5.06e-06
NC_052543.1 20072122    20072156    CTGF_9  -   -   35.8073 2.2e-14 5.06e-06
NC_052549.1 8808877 8808910 CTGF_10 -   -   34.4797 2.37e-14    1.21e-05

salvzzz commented 1 year ago

I had the same issue and found a solution that may help you as well.

    if posTad < len(bList) and chromPosition < len(cList) and bList[posTad][0] != cList[chromPosition][0][0]:
IndexError: list index out of range

I realized I had the coordinates of some scaffold-chromosomes in one file but not in others.

Make sure that every chromosome present in --proteinFile is also present in --domainFiles, and vice versa. If hicMergeDomains cannot find a matching chromosome, it fails with that error.

If all chromosomes match, it should work. If you cannot avoid discrepancies between the two file types, another option is to process each chromosome separately.
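
The check described above can be scripted. What follows is a minimal sketch, assuming plain whitespace- or tab-separated BED files with the chromosome name in the first column. The helper names `chromosomes_in_bed` and `missing_chromosomes` are hypothetical and not part of HiCExplorer.

```python
def chromosomes_in_bed(path):
    """Return the set of chromosome names found in column 1 of a BED file."""
    chroms = set()
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            # Skip blank lines and common BED header/comment lines.
            if not line or line.startswith(("#", "track", "browser")):
                continue
            chroms.add(line.split()[0])
    return chroms


def missing_chromosomes(paths):
    """Map each file to the chromosomes seen in other files but not in it."""
    per_file = {str(p): chromosomes_in_bed(p) for p in paths}
    all_chroms = set().union(*per_file.values())
    return {name: sorted(all_chroms - chroms) for name, chroms in per_file.items()}
```

Calling `missing_chromosomes([...])` with the protein file and all domain files returns one list per file. Any non-empty list names chromosomes (for example unplaced scaffolds) that would need to be filtered out of the other files, or handled in a separate run, before calling hicMergeDomains.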

Hope this can help you!

ashbymorrison commented 9 months ago

Hi, I am having the same problem, but I am not using --proteinFile. I am only using --domainFiles produced by hicFindTADs. The _domains.bed files at 1 Mb, 100 kb and 50 kb work fine, but I get the error below when I add the 40 kb and 20 kb _domains.bed files.

$ hicMergeDomains --domainFiles IMR90_Wt.1000000_balanced_hicfindtads_domains.bed IMR90_Wt.100000_balanced_hicfindtads_domains.bed IMR90_Wt.50000_balanced_hicfindtads_domains.bed IMR90_Wt.40000_balanced_hicfindtads_domains.bed
Saved relation tree of chr1
Saved relation tree of chr2
Traceback (most recent call last):
  File "/home/users/ashbym/miniconda3/envs/hic_explorer/bin/hicMergeDomains", line 7, in <module>
    main()
  File "/home/users/ashbym/miniconda3/envs/hic_explorer/lib/python3.9/site-packages/hicexplorer/hicMergeDomains.py", line 428, in main
    create_tree(relationList, mergedListWithId, args.outputTreePlotPrefix, args.outputTreePlotFormat)
  File "/home/users/ashbym/miniconda3/envs/hic_explorer/lib/python3.9/site-packages/hicexplorer/hicMergeDomains.py", line 280, in create_tree
    while int(sList[posSList][1][0][3:]) < int(rList[posRList][1][3:]):
IndexError: list index out of range

It would be great if someone could suggest a fix. Thanks!

Salviaz commented 9 months ago

What is the difference between these files? Try checking whether the TADs called at different resolutions cover different sets of chromosomes.

I have no idea why this would happen, but to be sure, double-check it by running cat ${FILE} | cut -f 1 | sort | uniq on every input file and see whether anything stands out.
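
The same check can be run over all input files at once. This is a rough sketch, assuming whitespace- or tab-separated BED files; `domains_per_chromosome` and `count_table` are hypothetical helper names, not HiCExplorer functions. Besides listing the chromosome names, it counts the records per chromosome, so a chromosome with zero domains in one file is visible as well.

```python
from collections import Counter


def domains_per_chromosome(path):
    """Count BED records per chromosome (column 1) in one file."""
    counts = Counter()
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            # Skip blank lines and comment lines.
            if not fields or fields[0].startswith("#"):
                continue
            counts[fields[0]] += 1
    return counts


def count_table(paths):
    """Return (chromosome, [count in each file]) rows, sorted by chromosome name."""
    tables = [domains_per_chromosome(p) for p in paths]
    chroms = sorted(set().union(*tables))
    return [(chrom, [t.get(chrom, 0) for t in tables]) for chrom in chroms]


def print_count_table(paths):
    """Print the table as tab-separated text, one chromosome per line."""
    for chrom, counts in count_table(paths):
        print(chrom + "\t" + "\t".join(str(c) for c in counts))
```

A row containing a 0 marks a chromosome that is absent from one of the files.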

ashbymorrison commented 9 months ago

Thanks for the advice. I checked the chromosome names: all were normal 'chrN' names and matched across all of the 'domainFiles'. I added some extra debugging output, and it looks like the 'sList' that holds the TAD info for each chromosome was empty. This only occurred when I combined 'domainFiles' with larger bins (1 Mb to 50 kb) and smaller bins (40 kb, 20 kb) AND listed the larger-bin files first in the 'hicMergeDomains' command. When I reorder the 'domainFiles' so that the smaller-bin files come first, the 'hicMergeDomains' command works fine. I am not really sure why. So:

Outputs "IndexError: list index out of range" error

hicMergeDomains --domainFiles IMR90_Wt.1000000_balanced_hicfindtads_domains.bed IMR90_Wt.100000_balanced_hicfindtads_domains.bed IMR90_Wt.50000_balanced_hicfindtads_domains.bed IMR90_Wt.40000_balanced_hicfindtads_domains.bed IMR90_Wt.20000_balanced_hicfindtads_domains.bed

No error

hicMergeDomains --domainFiles IMR90_Wt.20000_balanced_hicfindtads_domains.bed IMR90_Wt.40000_balanced_hicfindtads_domains.bed IMR90_Wt.50000_balanced_hicfindtads_domains.bed IMR90_Wt.100000_balanced_hicfindtads_domains.bed IMR90_Wt.1000000_balanced_hicfindtads_domains.bed
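
The ordering used in the working command can be produced automatically. The sketch below is only a workaround helper under a loud assumption: that the bin size appears in each file name as a run of digits delimited by '.' or '_', as in the file names in this thread. `resolution_from_name` and `sort_by_resolution` are hypothetical names, and why the order matters to hicMergeDomains is not established here.

```python
import os
import re

# Assumption: the bin size is a run of digits delimited by '.' or '_' in the
# file name, e.g. "IMR90_Wt.20000_balanced_hicfindtads_domains.bed".
# The "90" in "IMR90" is not matched because it is not preceded by '.' or '_'.
_RESOLUTION = re.compile(r"(?<=[._])\d+(?=[._])")


def resolution_from_name(path):
    """Extract the bin size (in bp) from a domain file name."""
    match = _RESOLUTION.search(os.path.basename(path))
    if match is None:
        raise ValueError(f"no bin size found in file name: {path}")
    return int(match.group())


def sort_by_resolution(paths):
    """Order domain files from the smallest bin size to the largest."""
    return sorted(paths, key=resolution_from_name)
```

Joining the result with spaces, for example `" ".join(sort_by_resolution(files))`, gives the argument list to pass after --domainFiles.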

usernicai commented 6 months ago

@ashbymorrison Hello, I have encountered the same problem. Have you found a solution? I get the error whether I list the files from small to large bins or from large to small bins.

ashbymorrison commented 6 months ago

In my case, the error seemed to be resolved by listing the files from small bins to large bins in the hicMergeDomains command. When I listed the large bins first and then the small bins, I got the error.

usernicai commented 6 months ago

Thanks, but that didn't solve it in my case, and I checked the chromosome names too.