Closed adadiehl closed 3 years ago
Did you use the chr.sizes file both to make the mcool file and to run the SIP analysis? The failure here is most likely a cooler error caused by chr19_KI270915v1_alt being absent from the mcool file.
The short answer is that, based on errors and (lack of) results from SIP, I suspect a problem with the matrix files. At least, they might not contain what SIP is expecting, as they come from the Pore-C method. This may be out of your wheelhouse, but if you can make any sense of the problem, it would be really helpful.
These are not my own data, but are from the recent preprint for the Pore-C method (https://www.biorxiv.org/content/10.1101/833590v1.full.pdf).
Data were retrieved from https://ont-datasets-us-east-1-public.s3.amazonaws.com/20191103.preprint_NA12878.tar.gz and I am using their chromsizes file, which is in results/refgenome/GRCh38.rg.chromsizes. I can only guess that this is the chromsizes file they used to prepare the matrices, which are in data/matrix.
Interestingly, though taking out the unrepresented entries from the chromsizes file appeared to get SIP running properly, I'm not sure things are truly working. For one, the program never returned to the command line after printing the "End of SIP loops are available in loops/" message, requiring a ctrl-c. Worse yet, the 5kbLoops.txt file was empty after the run finished. Looking back through terminal messages, I saw a lot of this...
'11 loops/5kb/chrX/chrX_55000000_64999999.txt
2 loops/5kb/chrX/chrX_10000000_19999999.txt
3 loops/5kb/chrX/chrX_15000000_24999999.txt
10 loops/5kb/chrX/chrX_50000000_59999999.txt
5 loops/5kb/chrX/chrX_25000000_34999999.txt
17 loops/5kb/chrX/chrX_85000000_94999999.txt
12 loops/5kb/chrX/chrX_60000000_69999999.txt
30 loops/5kb/chrX/chrX_150000000_156040895.txt
0 loops/5kb/chrX/chrX_0_9999999.txt
13 loops/5kb/chrX/chrX_65000000_74999999.txt
1 loops/5kb/chrX/chrX_5000000_14999999.txt
7 loops/5kb/chrX/chrX_35000000_44999999.txt
27 loops/5kb/chrX/chrX_135000000_144999999.txt
9 loops/5kb/chrX/chrX_45000000_54999999.txt
4 loops/5kb/chrX/chrX_20000000_29999999.txt
####### End loops detection for chr chrX 0 loops before the FDR filter
Filtering value at 0.01 FDR is 10000.0 APscore and 10000.0 RegionalAPscore
Deleting image file for chrX'
Why 0 loops before FDR filter when it looks like it should be ~150?
In case it helps, the Pore-C processing pipeline and snakemake workflows they employ are in these repositories: https://github.com/nanoporetech/pore-c and https://github.com/nanoporetech/Pore-C-Snakemake
Thank you for your time!
Best, Adam
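One observation on the log above, offered as a reading of the pasted output rather than anything confirmed against the SIP source: the integer printed before each file name may be a tile index rather than a per-tile loop count. The sketch below parses the chrX lines from the 5 kb run and checks whether each number equals the tile start divided by 5,000,000. The 5 Mb step is an assumption inferred from the spacing of the tile starts in the file names.

```python
import re

# Log lines copied from the chrX output of the 5 kb run quoted above.
LOG = """\
11 loops/5kb/chrX/chrX_55000000_64999999.txt
2 loops/5kb/chrX/chrX_10000000_19999999.txt
3 loops/5kb/chrX/chrX_15000000_24999999.txt
10 loops/5kb/chrX/chrX_50000000_59999999.txt
5 loops/5kb/chrX/chrX_25000000_34999999.txt
17 loops/5kb/chrX/chrX_85000000_94999999.txt
12 loops/5kb/chrX/chrX_60000000_69999999.txt
30 loops/5kb/chrX/chrX_150000000_156040895.txt
0 loops/5kb/chrX/chrX_0_9999999.txt
13 loops/5kb/chrX/chrX_65000000_74999999.txt
1 loops/5kb/chrX/chrX_5000000_14999999.txt
7 loops/5kb/chrX/chrX_35000000_44999999.txt
27 loops/5kb/chrX/chrX_135000000_144999999.txt
9 loops/5kb/chrX/chrX_45000000_54999999.txt
4 loops/5kb/chrX/chrX_20000000_29999999.txt
"""

# Assumption: consecutive tiles start 5 Mb apart (inferred from the file names).
STEP = 5_000_000


def parse(log):
    """Return (printed_number, tile_start) pairs from 'N path' log lines."""
    pairs = []
    for line in log.splitlines():
        m = re.match(r"(\d+)\s+\S*_(\d+)_(\d+)\.txt$", line.strip())
        if m:
            pairs.append((int(m.group(1)), int(m.group(2))))
    return pairs


pairs = parse(LOG)
total = sum(n for n, _ in pairs)
all_indices = all(n == start // STEP for n, start in pairs)
print("sum of printed numbers:", total)
print("every number equals start // STEP:", all_indices)
```

For these 15 lines the printed numbers sum to 151, which matches the "~150" estimate, and every one of them equals its tile start divided by 5 Mb. If that reading is right, the numbers say nothing about how many loops were found in each tile, and "0 loops before the FDR filter" would not contradict them.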
I will check that.
Can you rerun the program at 10 kb resolution, as in their paper? -res 10000 -mat 1000 -t 1500 -fdr 0.05
Done... Still no loop calls, though I see different loop counts written to the terminal now...
'Deleting image file for chrUn_KI270391v1
3 loops/10kb/chrX/chrX_15000000_24999999.txt
10 loops/10kb/chrX/chrX_50000000_59999999.txt
5 loops/10kb/chrX/chrX_25000000_34999999.txt
17 loops/10kb/chrX/chrX_85000000_94999999.txt
12 loops/10kb/chrX/chrX_60000000_69999999.txt
30 loops/10kb/chrX/chrX_150000000_156040895.txt
0 loops/10kb/chrX/chrX_0_9999999.txt
13 loops/10kb/chrX/chrX_65000000_74999999.txt
1 loops/10kb/chrX/chrX_5000000_14999999.txt
7 loops/10kb/chrX/chrX_35000000_44999999.txt
27 loops/10kb/chrX/chrX_135000000_144999999.txt
9 loops/10kb/chrX/chrX_45000000_54999999.txt
4 loops/10kb/chrX/chrX_20000000_29999999.txt
####### End loops detection for chr chrX 0 loops before the FDR filter
Filtering value at 0.05 FDR is 10000.0 APscore and 10000.0 RegionalAPscore
Deleting image file for chrX'
Hope this helps!
Best, Adam
Thanks, I will work on it and get back to you. I am not sure when, but I hope before the end of the week.
Best, Axel
Hi, sorry for the delay in answering; I was working on other things. The problem here is that the values in the mcool are really low, so most of them are close to 0 and come out black during the image processing. The scaling factor I put in to correct for that is too small: it was enough for all the data sets I tested, but too small for this one. You can wait for my release, or you can take the files made by SIP and multiply the last two columns by 100 or 1000.
Best, Axel
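The workaround described above (multiplying the last two columns by 100 or 1000) could be sketched roughly as follows. The thread does not spell out which SIP-generated files are meant or their exact layout, so this assumes tab-separated text with numeric values in the last two columns; the function name `scale_last_two_columns` and its parameters are hypothetical, not part of SIP.

```python
from pathlib import Path


def scale_last_two_columns(src, dst, factor=1000.0, sep="\t"):
    """Write a copy of `src` to `dst` with the last two columns multiplied by `factor`.

    Assumption: each data line is `sep`-delimited with numeric values in its
    last two fields. Lines that do not fit (blank lines, headers) are copied
    through unchanged.
    """
    out_lines = []
    for line in Path(src).read_text().splitlines():
        fields = line.split(sep)
        if len(fields) < 2:
            out_lines.append(line)  # blank or single-field line: keep as-is
            continue
        try:
            scaled = [repr(float(v) * factor) for v in fields[-2:]]
        except ValueError:
            out_lines.append(line)  # non-numeric tail, e.g. a header: keep as-is
            continue
        out_lines.append(sep.join(fields[:-2] + scaled))
    Path(dst).write_text("\n".join(out_lines) + "\n")
```

Writing to a separate output path keeps the original files intact, so the factor (100 versus 1000) can be tried both ways without rerunning the dump step.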
See traceback...
`Traceback (most recent call last):
  File "/home/adadiehl/.conda/envs/cooler_env/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 2895, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas/_libs/index.pyx", line 70, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 101, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1675, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1683, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'chr19_KI270915v1_alt'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/adadiehl/.conda/envs/cooler_env/lib/python3.7/site-packages/cooler/util.py", line 167, in parse_region
    clen = chromsizes[chrom] if chromsizes is not None else None
  File "/home/adadiehl/.conda/envs/cooler_env/lib/python3.7/site-packages/pandas/core/series.py", line 882, in __getitem__
    return self._get_value(key)
  File "/home/adadiehl/.conda/envs/cooler_env/lib/python3.7/site-packages/pandas/core/series.py", line 989, in _get_value
    loc = self.index.get_loc(label)
  File "/home/adadiehl/.conda/envs/cooler_env/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 2897, in get_loc
    raise KeyError(key) from err
KeyError: 'chr19_KI270915v1_alt'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/adadiehl/.conda/envs/cooler_env/bin/cooler", line 10, in <module>
    sys.exit(cli())
  File "/home/adadiehl/.conda/envs/cooler_env/lib/python3.7/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/home/adadiehl/.conda/envs/cooler_env/lib/python3.7/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/home/adadiehl/.conda/envs/cooler_env/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/adadiehl/.conda/envs/cooler_env/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/adadiehl/.conda/envs/cooler_env/lib/python3.7/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/home/adadiehl/.conda/envs/cooler_env/lib/python3.7/site-packages/cooler/cli/_util.py", line 200, in decorated
    func(*args, **kwargs)
  File "/home/adadiehl/.conda/envs/cooler_env/lib/python3.7/site-packages/cooler/cli/dump.py", line 379, in dump
    h5, c._chromids, parse_region(range, c.chromsizes), binsize=c.binsize
  File "/home/adadiehl/.conda/envs/cooler_env/lib/python3.7/site-packages/cooler/util.py", line 169, in parse_region
    raise ValueError("Unknown sequence label: {}".format(chrom))
ValueError: Unknown sequence label: chr19_KI270915v1_alt
`
Filtering the offending records from the chromsizes file resolved the issue.
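For anyone hitting the same error, the filtering step could look roughly like the sketch below. It assumes a chromsizes file with one whitespace-separated `name size` record per line; the function name `filter_chromsizes` and the example file names are hypothetical. The set of names to keep has to come from the matrix itself, for instance from the `chromnames` attribute of a `cooler.Cooler` object opened on one resolution of the mcool.

```python
def filter_chromsizes(src, dst, keep):
    """Copy `src` to `dst`, keeping only records whose sequence name is in `keep`.

    Assumption: each non-blank line is "<name><whitespace><size>".
    Returns (kept_names, dropped_names) so the dropped entries can be reviewed.
    """
    keep = set(keep)
    kept, dropped = [], []
    with open(src) as fin, open(dst, "w") as fout:
        for line in fin:
            if not line.strip():
                continue  # skip blank lines
            name = line.split()[0]
            if name in keep:
                fout.write(line)
                kept.append(name)
            else:
                dropped.append(name)
    return kept, dropped


# Hypothetical usage (file names are placeholders; requires the cooler package):
#   import cooler
#   names = cooler.Cooler("matrix.mcool::/resolutions/10000").chromnames
#   filter_chromsizes("GRCh38.rg.chromsizes", "GRCh38.filtered.chromsizes", names)
```

Returning the dropped names makes it easy to confirm that only unrepresented sequences such as chr19_KI270915v1_alt were removed.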