nanopore-wgs-consortium / NA12878

Data and analysis for NA12878 genome on nanopore

fast5's of some RNA reads corrupted? #24

Open cvdelannoy opened 6 years ago

cvdelannoy commented 6 years ago

Some of the fast5 files for the RNA reads do not seem to be readable. I downloaded the reads from the 5 Bham runs (using wget; I don't know if that matters). My Python scripts (using the h5py library) fail with "OSError: Unable to open file (file signature not found)". HDFView 2.13.0 also fails to open them (java.io.IOException: Unsupported fileformat). Re-downloading them doesn't solve the issue, and many other fast5 files in the same set read just fine. I've attached a list of some of the affected reads, but there are more: the quick-and-dirty script that hit them only printed error messages to the screen, and I copy-pasted this list from there. I'll try to get a complete list of all the reads that seem unreadable.

Are these actually corrupted, or am I doing something wrong?

na12878_rnaReads_failed.txt
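For anyone who wants to build the complete list mentioned above, here is a minimal sketch using only the Python standard library. It relies on the fact that an HDF5 file (which is what a fast5 is) carries an 8-byte signature at offset 0, 512, 1024, or a further doubling of that offset; the "file signature not found" error means that signature is missing. The function names and the directory layout are assumptions for illustration and are not taken from the scripts described in this issue. A file that passes this check can still be damaged further in, so this only catches the specific failure reported here.

```python
import os

# The 8-byte signature that starts every HDF5 superblock.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def has_hdf5_signature(path, max_offset=1 << 20):
    """Return True if the HDF5 signature sits at offset 0, 512, 1024, ...

    max_offset is an arbitrary cap (hypothetical default) on how far to search.
    """
    size = os.path.getsize(path)
    with open(path, "rb") as handle:
        offset = 0
        while offset + len(HDF5_SIGNATURE) <= size and offset <= max_offset:
            handle.seek(offset)
            if handle.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE:
                return True
            # Allowed superblock offsets are 0, then 512 doubled repeatedly.
            offset = 512 if offset == 0 else offset * 2
    return False


def find_unreadable_fast5(root):
    """Walk root and return the sorted paths of .fast5 files lacking the signature."""
    bad = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".fast5"):
                path = os.path.join(dirpath, name)
                if not has_hdf5_signature(path):
                    bad.append(path)
    return sorted(bad)
```

Calling `find_unreadable_fast5` on the download directory and writing the result to a text file gives a full list instead of one copy-pasted from the terminal.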

nickloman commented 6 years ago

It's pretty common to get some unreadable FAST5s generated by the system. I will check the source files just to ensure there wasn't a transfer error.
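One way to tell a transfer error apart from a file that was already unreadable at the source is to compare checksums of the two copies. The sketch below assumes an MD5 digest of the source file is available from somewhere, which this thread does not state; the function names are hypothetical.

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_md5):
    """Compare a local file against a digest obtained from the source (assumed available)."""
    return file_md5(path) == expected_md5.strip().lower()
```

If the digests match and the file still will not open, the download is not the problem and the file was most likely written unreadable in the first place.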