Open Deleetdk opened 2 years ago
Hi,
As you are using a CRAM-formatted file, could you please try with the reference genome and let us know whether the problem persists?
cnvpytor -root file.pytor -rd NG1IL0F60J.cram -T <path for reference fasta file>
Thanks Arijit
I've let it run for 6+ hours so far; it seems to be stuck:
cnvpytor -root file.pytor -rd NG1IL0F60J.cram -T /data/genomics/reference_files/hg38.fa
2022-08-29 01:34:53,231 - cnvpytor.bam - INFO - File: NG1IL0F60J.cram successfully open
2022-08-29 01:34:53,232 - cnvpytor.bam - INFO - Detected reference genome: hg38
2022-08-29 01:34:53,236 - cnvpytor.pool - INFO - Parallel processing using 8 cores
2022-08-29 01:34:53,248 - cnvpytor.root - INFO - Reading data for chromosome chr2 with length 242193529
2022-08-29 01:34:53,248 - cnvpytor.root - INFO - Reading data for chromosome chr1 with length 248956422
2022-08-29 01:34:53,248 - cnvpytor.root - INFO - Reading data for chromosome chr3 with length 198295559
2022-08-29 01:34:53,248 - cnvpytor.root - INFO - Reading data for chromosome chr4 with length 190214555
2022-08-29 01:34:53,248 - cnvpytor.root - INFO - Reading data for chromosome chr5 with length 181538259
2022-08-29 01:34:53,248 - cnvpytor.root - INFO - Reading data for chromosome chr6 with length 170805979
2022-08-29 01:34:53,249 - cnvpytor.root - INFO - Reading data for chromosome chr7 with length 159345973
2022-08-29 01:34:53,249 - cnvpytor.root - INFO - Reading data for chromosome chr8 with length 145138636
[E::cram_decode_slice] MD5 checksum reference mismatch for ref 2 pos 16776232..16814735
[E::cram_decode_slice] CRAM: 94982bbafd95a1a748fa20098fa90785
[E::cram_decode_slice] Ref : f16024284d657779afbaff7aeafdee31
[E::cram_next_slice] Failure to decode slice
2022-08-29 01:35:13,239 - cnvpytor.bam - ERROR - Error while reading file 'NG1IL0F60J.cram'
[E::cram_decode_slice] MD5 checksum reference mismatch for ref 1 pos 20944307..20980664
[E::cram_decode_slice] CRAM: d6684283cb67e862e1f3c1a612609f28
[E::cram_decode_slice] Ref : 14de7a8fa2a63952b296c2f2457ac77c
[E::cram_next_slice] Failure to decode slice
2022-08-29 01:35:18,101 - cnvpytor.bam - ERROR - Error while reading file 'NG1IL0F60J.cram'
[E::cram_decode_slice] MD5 checksum reference mismatch for ref 4 pos 47308302..49599821
[E::cram_decode_slice] CRAM: c4a1a9cc77b233653ed493122ac7d8f6
[E::cram_decode_slice] Ref : a0079071eceaa2b978aec2dbe12e9744
[E::cram_next_slice] Failure to decode slice
2022-08-29 01:35:47,822 - cnvpytor.bam - ERROR - Error while reading file 'NG1IL0F60J.cram'
[E::cram_decode_slice] MD5 checksum reference mismatch for ref 5 pos 61323012..61378416
[E::cram_decode_slice] CRAM: f6e90e6d1513b16c3f6b27b12470e858
[E::cram_decode_slice] Ref : d31242f5e8ad326d470866fa93641452
[E::cram_next_slice] Failure to decode slice
2022-08-29 01:36:06,459 - cnvpytor.bam - ERROR - Error while reading file 'NG1IL0F60J.cram'
2022-08-29 01:37:41,411 - cnvpytor.root - INFO - Reading data for chromosome chr9 with length 138394717
[E::cram_decode_slice] MD5 checksum reference mismatch for ref 6 pos 154558071..154597614
[E::cram_decode_slice] CRAM: 828f186956ba7548c8cb2a5cf3f58d95
[E::cram_decode_slice] Ref : 4e7e0d006cfebccf4945eba9a783830e
[E::cram_next_slice] Failure to decode slice
2022-08-29 01:37:51,781 - cnvpytor.bam - ERROR - Error while reading file 'NG1IL0F60J.cram'
2022-08-29 01:38:35,198 - cnvpytor.root - INFO - Reading data for chromosome chr10 with length 133797422
[E::cram_decode_slice] MD5 checksum reference mismatch for ref 8 pos 89873013..89913176
[E::cram_decode_slice] CRAM: 64c0251ef044d4a1f07ddb3c6091ff65
[E::cram_decode_slice] Ref : de90d4be921385c7764f1156bca9e3fb
[E::cram_next_slice] Failure to decode slice
2022-08-29 01:39:05,648 - cnvpytor.bam - ERROR - Error while reading file 'NG1IL0F60J.cram'
[E::cram_decode_slice] MD5 checksum reference mismatch for ref 9 pos 39224987..39261161
[E::cram_decode_slice] CRAM: 1db319bc13d26df6c359d41e6304a51a
[E::cram_decode_slice] Ref : be226a28220ba1ae4633caf36a971ffb
[E::cram_next_slice] Failure to decode slice
2022-08-29 01:39:19,863 - cnvpytor.bam - ERROR - Error while reading file 'NG1IL0F60J.cram'
[E::cram_decode_slice] MD5 checksum reference mismatch for ref 0 pos 248745489..248787439
[E::cram_decode_slice] CRAM: a2844d84c77fd8b4f48ae633c2d0f962
[E::cram_decode_slice] Ref : 20c6c56d70d5fd59b58aa111d0dcccd1
[E::cram_next_slice] Failure to decode slice
2022-08-29 01:39:40,503 - cnvpytor.bam - ERROR - Error while reading file 'NG1IL0F60J.cram'
You were right, the protocol errors disappeared.
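It seems that the reference used to create the CRAM file is not the same as the hg38.fa reference you provided on the command line. The MD5 checksum mismatches in the log indicate that the sequences differ, even though both are labeled hg38. Please check the assembly version in the CRAM header.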
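
One way to run that check, as a rough sketch rather than anything cnvpytor itself provides: compare the per-contig M5 tags from the CRAM header's @SQ lines against MD5 digests computed from the FASTA. The helper names below (parse_sq_lines, fasta_md5, mismatched_contigs) are made up for illustration. It assumes the header text was obtained separately, e.g. with `samtools view -H NG1IL0F60J.cram`, and that the header actually carries M5 tags. The digest is taken over the uppercased sequence with line breaks removed, which is how CRAM reference checksums are defined.

```python
import hashlib


def parse_sq_lines(header_text):
    """Return {contig: {tag: value}} for every @SQ line of a SAM/CRAM header."""
    contigs = {}
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Fields after "@SQ" look like "SN:chr1", "LN:248956422", "M5:...", "AS:..."
        tags = dict(field.split(":", 1) for field in line.split("\t")[1:])
        contigs[tags["SN"]] = tags
    return contigs


def fasta_md5(fasta_lines):
    """Return {contig: md5 hex digest} over the uppercased sequence without line breaks."""
    digests = {}
    name, md5 = None, None
    for line in fasta_lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                digests[name] = md5.hexdigest()
            # Contig name is the first word after ">"
            name = line[1:].split()[0]
            md5 = hashlib.md5()
        elif name is not None:
            md5.update(line.upper().encode("ascii"))
    if name is not None:
        digests[name] = md5.hexdigest()
    return digests


def mismatched_contigs(header_text, fasta_lines):
    """List contigs whose header M5 is absent from, or differs from, the FASTA."""
    expected = parse_sq_lines(header_text)
    actual = fasta_md5(fasta_lines)
    return sorted(
        name
        for name, tags in expected.items()
        if "M5" in tags and actual.get(name) != tags["M5"]
    )
```

Passing an open file handle for the FASTA (for example `open("/data/genomics/reference_files/hg38.fa")`) streams it line by line, so the whole genome is never held in memory. Any contig returned by mismatched_contigs points to a reference build or patch level that differs from the one used to write the CRAM.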
Working with my own genome in CRAM format, I installed cnvpytor using pip as a local user. Installation and downloading of files proceeded without issues. However, it produced errors at run time:
I see nothing reported here or on Google. The URLs work fine, e.g. https://www.ebi.ac.uk/ena/cram/md5/6aef897c3d6ff0c78aff06ac189178dd opens OK in the browser, though of course it is not readable by humans (binary).
Any ideas? I will try the GitHub version.
My CRAM file is here: https://filedn.eu/lCyoUMpONNB7afAi4dJTUyX/data/genomics/personal%20genomes/emil/