bioinfomaticsCSU / deepsignal

Detecting methylation using signal-level features from Nanopore sequencing reads
GNU General Public License v3.0

running "deepsignal extract" without tombo.index #9

Closed Puputnik closed 5 years ago

Puputnik commented 5 years ago

Hi!

I'm running a large analysis on DNA data (about 15,000,000 reads). I was running tombo resquiggle when my server suddenly crashed, and only half of the reads had been correctly resquiggled. I checked which reads had been resquiggled with a custom Python script using h5py, and I relaunched tombo resquiggle on the others. However, no tombo.index was generated for the reads that had been correctly resquiggled before the crash. Since it took almost 3 days to resquiggle those reads, it would be painful to re-run the resquiggle on the entire dataset. Is the tombo.index actually required for running deepsignal extract (and downstream steps), or is it not essential?
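For reference, a check of this kind could look roughly like the sketch below. It is not the script used here. It assumes Tombo's default output location inside each fast5 (`Analyses/RawGenomeCorrected_000/BaseCalled_template/Events`) and single-read fast5 files; the function names and the `events_path` parameter are hypothetical and would need adjusting if a different `--corrected-group` was used.

```python
from pathlib import Path

# Assumption: Tombo's default corrected group and basecall subgroup.
# Change this if resquiggle was run with non-default group names.
EVENTS_PATH = "Analyses/RawGenomeCorrected_000/BaseCalled_template/Events"


def has_resquiggle(h5_obj, events_path=EVENTS_PATH):
    """Return True if the opened fast5 (or any mapping) contains events_path."""
    return events_path in h5_obj


def split_by_resquiggle(fast5_dir, events_path=EVENTS_PATH):
    """Sort the fast5 files under fast5_dir into (resquiggled, remaining)."""
    import h5py  # imported lazily so has_resquiggle works without h5py

    done, todo = [], []
    for path in sorted(Path(fast5_dir).rglob("*.fast5")):
        try:
            with h5py.File(path, "r") as f:
                ok = has_resquiggle(f, events_path)
        except OSError:
            # Unreadable or truncated file (e.g. being written at crash time)
            ok = False
        (done if ok else todo).append(path)
    return done, todo
```

Calling `split_by_resquiggle("fast5s/")` (a placeholder directory) returns two lists of paths; the second list is what would be passed back to tombo resquiggle. Files that cannot be opened are treated as not resquiggled, which errs on the side of re-processing them.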

Thanks a lot in advance

PengNi commented 5 years ago

Hi @Puputnik,

Thanks for your interest. The extract module does NOT need tombo.index. It only needs the re-squiggled fast5 files.

Also, if you only want to call modifications with the pre-trained model, I suggest calling modifications directly from the resquiggled fast5 files, skipping the feature extraction step, to save disk space.

Best, Peng