ay-lab / mustache

Multi-scale Detection of Chromatin Loops from Hi-C and Micro-C Maps using Scale-Space Representation
MIT License

Mustache crashes on hic file #23

Closed by charlesfeigin 2 years ago

charlesfeigin commented 3 years ago

Hi, I'm trying to run mustache on a .hic file given to me by a collaborator. I installed mustache using conda and was able to run the tests successfully. However, when I run my command (below), I get the error shown further below. I've had them regenerate the .hic file and still get the same error. Any guidance would be greatly appreciated. Thank you!

Command:

mustache -p 64 -f mysample.hic -d 500000 -r 10000 -o mysample_mustache_test1 2>mysample_mustache_test1.err &

Error:

Traceback (most recent call last):
  File "/home/cfeigin/.conda/envs/mustache/lib/python3.8/site-packages/mustache/mustache.py", line 396, in read_hic_file
    temp = straw.straw("KR", f, str(chr1)+":"+str(int(start))+":"+str(int(end)), str(chr2)+":"+str(int(start))+":"+str(int(end)), "BP", res)
  File "/home/cfeigin/.conda/envs/mustache/lib/python3.8/site-packages/straw/straw.py", line 511, in straw
    myFilePos=list1[0]
TypeError: 'int' object is not subscriptable

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/cfeigin/.conda/envs/mustache/bin/mustache", line 8, in <module>
    sys.exit(main())
  File "/home/cfeigin/.conda/envs/mustache/lib/python3.8/site-packages/mustache/mustache.py", line 1137, in main
    o = regulator(f, args.norm_method, CHRM_SIZE, args.outdir,
  File "/home/cfeigin/.conda/envs/mustache/lib/python3.8/site-packages/mustache/mustache.py", line 948, in regulator
    x, y, v = read_hic_file(f, norm_method, CHRM_SIZE, distance_in_bp, chromosome,chromosome2, res)
  File "/home/cfeigin/.conda/envs/mustache/lib/python3.8/site-packages/mustache/mustache.py", line 426, in read_hic_file
    temp = straw.straw("VC", f, str(chr1)+":"+str(int(start))+":"+str(int(end)), str(chr2)+":"+str(int(start))+":"+str(int(end)), "BP", res)
  File "/home/cfeigin/.conda/envs/mustache/lib/python3.8/site-packages/straw/straw.py", line 511, in straw
    myFilePos=list1[0]
TypeError: 'int' object is not subscriptable

ay-lab commented 3 years ago

Hi Charles, thanks for using our tool. It seems that mustache cannot find the default normalization data (KR or VC) in your .hic file. You can specify the normalization method yourself with the "-norm" parameter, replacing X with a normalization that is present in your file:

mustache -p 64 -f mysample.hic -d 500000 -r 10000 -norm X -o mysample_mustache_test1

charlesfeigin commented 3 years ago

Thank you for your reply, we found an error that we had made.

Just a quick question before closing this out: do you have the values for all default parameters (e.g. for -d)? These don't seem to be listed on the GitHub page.

charlesfeigin commented 3 years ago

Sorry, adding to the previous question. While the number of loops called clearly changes as I increase the -d parameter, I get many loop calls that exceed -d. E.g. if I use -d 250000 (250 kb), I get a loop that spans 570,000 bp. Is the max distance not a hard cutoff?

ay-lab commented 3 years ago

It should print out the distance it is using (yes, in some cases it needs to change the distance you input). If you are using 10 kb resolution, I suggest you run with the default 2 Mb distance and filter the results afterwards based on the distance you want. For 5 kb and 10 kb resolution, the default distance is 2 Mb. Other than the distance, I think the defaults for the other parameters are listed in the parameter table.
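
The post-hoc filtering suggested above could look roughly like this. It is only a minimal sketch: it assumes the loop file is tab-separated with a header row and that the anchor start coordinates are in columns named BIN1_START and BIN2_START. Those column names, the made-up example rows, and the helper `filter_loops_by_distance` are assumptions for illustration, so check them against the header of your own output file.

```python
import csv
import io

def filter_loops_by_distance(rows, max_distance_bp,
                             start1_col="BIN1_START", start2_col="BIN2_START"):
    """Keep loop calls whose two anchors are at most max_distance_bp apart.

    rows: iterable of dicts (e.g. from csv.DictReader).
    The column names are assumptions; adjust them to match your file's header.
    """
    kept = []
    for row in rows:
        span = abs(int(row[start2_col]) - int(row[start1_col]))
        if span <= max_distance_bp:
            kept.append(row)
    return kept

# Tiny made-up table standing in for a loop file (not real data).
example_tsv = (
    "BIN1_CHR\tBIN1_START\tBIN1_END\tBIN2_CHR\tBIN2_START\tBIN2_END\n"
    "chr1\t1000000\t1010000\tchr1\t1200000\t1210000\n"
    "chr1\t2000000\t2010000\tchr1\t2570000\t2580000\n"
)
reader = csv.DictReader(io.StringIO(example_tsv), delimiter="\t")
short_loops = filter_loops_by_distance(reader, max_distance_bp=250_000)
print(len(short_loops))  # the 570 kb loop is dropped, the 200 kb loop is kept
```

For a real file, the `io.StringIO` object would be replaced by an opened file handle.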

charlesfeigin commented 3 years ago

Thanks, I can do that. However, wouldn't this cause over-correction by FDR? E.g. if using Benjamini-Hochberg, p-values will be adjusted relative to the number of tests performed. So if I'm only interested in loops within 250 kb, the loops output by the program using 2 Mb would have been corrected using a count that includes many peaks well outside of that range, decreasing the significance of those within the range that is biologically interesting.
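
To make the concern concrete, here is a small sketch of the textbook Benjamini-Hochberg adjustment applied to the same p-values, once on their own and once alongside extra, mostly non-significant tests. The p-values are made up, and this illustrates the general procedure only; it is not a claim about how mustache computes its reported FDR.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Made-up p-values for illustration only.
in_range = [0.001, 0.004, 0.02]            # tests within the distance of interest
out_of_range = [0.3, 0.5, 0.6, 0.8, 0.9]   # extra tests at longer distances

q_subset = benjamini_hochberg(in_range)
q_full = benjamini_hochberg(in_range + out_of_range)[:len(in_range)]

for p, qs, qf in zip(in_range, q_subset, q_full):
    print(f"p={p}: adjusted alone={qs:.4f}, adjusted with extra tests={qf:.4f}")
```

In this toy case every in-range test gets a larger adjusted value when the extra tests are included. Re-adjusting on a subset would need the raw p-values, not only the already-adjusted ones.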

ay-lab commented 3 years ago

I agree, but 250 kb at 10 kb resolution is only 25 pixels, which is not really enough for image processing.
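
The pixel count above is just the distance divided by the bin size. A quick sketch of that arithmetic, where the helper name `distance_in_pixels` is made up for illustration:

```python
def distance_in_pixels(distance_bp, resolution_bp):
    """Number of contact-map bins (pixels) spanned by a genomic distance."""
    return distance_bp // resolution_bp

print(distance_in_pixels(250_000, 10_000))    # 25 pixels for a 250 kb cutoff
print(distance_in_pixels(2_000_000, 10_000))  # 200 pixels for the 2 Mb default
```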