bioinfomaticsCSU / deepsignal

Detecting methylation using signal-level features from Nanopore sequencing reads
GNU General Public License v3.0

Trained Models and Yeast? #72

Closed JayVaz18 closed 3 years ago

JayVaz18 commented 3 years ago

Hi,

I am interested in using the tool you've developed; however, I see that you provide specific trained models. Do you know if these models would work with yeast?

PengNi commented 3 years ago

Hi @JayVaz18, what type of methylation do you intend to call? With the pre-trained 5mC model, deepsignal achieves high performance in 5mCpG detection in human and other species (such as Arabidopsis). However, the pre-trained 6mA model may not work well enough.

Best, Peng

JayVaz18 commented 3 years ago

Thanks for getting back to me. I'm trying to call 5mC (really just methylation at cytosine bases) and would like to determine which reads contain methylation, as well as at which positions within each read.

PengNi commented 3 years ago

@JayVaz18, besides deepsignal, I recommend our deepsignal2, and Megalodon from ONT, for more accurate 5mCpG detection.

Best, Peng

JayVaz18 commented 3 years ago

OK, thanks. I actually tried using deepsignal2, but I can't figure out why I keep getting this error:

[14:19:34] Preparing reads and extracting read identifiers.
/Users/jakevazquez18/anaconda3/envs/py36/lib/python3.6/site-packages/tombo/_preprocess.py:378: H5pyDeprecationWarning: The default file mode will change to 'r' (read-only) in h5py 3.0. To suppress this warning, pass the mode you need to h5py.File(), or set the global default h5.get_config().default_file_mode, or set the environment variable H5PY_DEFAULT_READONLY=1. Available modes are: 'r', 'r+', 'w', 'w-'/'x', 'a'. See the docs for details.
  with h5py.File(fast5_fn) as fast5_data:
****** WARNING ****** Basecalls exsit in specified slot for some reads. Set --overwrite option to overwrite these basecalls.
100%|█████████████████████████████████████████████| 2/2 [00:00<00:00, 19.35it/s]
[14:19:35] Annotating FAST5s with sequence from FASTQs.
****** WARNING ****** Some FASTQ records contain read identifiers not found in any FAST5 files or sequencing summary files.
0it [00:00, ?it/s]
[14:19:35] Added sequences to a total of 0 reads.

This happens after running the guppy_basecaller command as provided in the README and then running the tombo preprocess command as instructed.
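
The log reports that some FASTQ read identifiers were not found in any FAST5 file and that sequences were added to 0 reads, so one way to narrow this down is to compare the two sets of read IDs directly. The following is only a rough sketch, not part of deepsignal2 or tombo. The helper names (`fastq_read_ids`, `fast5_read_ids`, `summarize_overlap`) are made up for illustration, and `fast5_read_ids` assumes single-read FAST5 files that store the ID in a `read_id` attribute under `Raw/Reads`; files with a different layout (for example, multi-read FAST5) are skipped and would simply show up as zero FAST5 reads.

```python
import glob
import os


def fastq_read_ids(fastq_path):
    """Return the set of read IDs in a FASTQ file (first token of each header line)."""
    ids = set()
    with open(fastq_path) as handle:
        for line_number, line in enumerate(handle):
            # FASTQ records span four lines; the header is the first line of each record.
            if line_number % 4 == 0 and line.startswith("@"):
                tokens = line[1:].split()
                if tokens:
                    ids.add(tokens[0])
    return ids


def fast5_read_ids(fast5_dir):
    """Collect read IDs from single-read FAST5 files under fast5_dir (assumed layout)."""
    import h5py  # imported lazily so the FASTQ helper works without h5py installed

    ids = set()
    pattern = os.path.join(fast5_dir, "**", "*.fast5")
    for path in glob.glob(pattern, recursive=True):
        with h5py.File(path, "r") as fast5:
            reads = fast5.get("Raw/Reads")
            if reads is None:
                continue  # not laid out as a single-read file; skip it
            for read in reads.values():
                read_id = read.attrs.get("read_id")
                if read_id is None:
                    continue
                if isinstance(read_id, bytes):
                    read_id = read_id.decode()
                ids.add(str(read_id))
    return ids


def summarize_overlap(fastq_ids, fast5_ids):
    """Count how many read IDs the FASTQ and FAST5 sets have in common."""
    return {
        "fastq_reads": len(fastq_ids),
        "fast5_reads": len(fast5_ids),
        "shared": len(fastq_ids & fast5_ids),
        "fastq_only": len(fastq_ids - fast5_ids),
    }
```

Calling `summarize_overlap(fastq_read_ids(...), fast5_read_ids(...))` on the basecalled FASTQ and the FAST5 directory passed to tombo would show whether the two share any IDs. A `shared` count of 0 would be consistent with the "Added sequences to a total of 0 reads" message above, though it would not by itself say why the IDs differ.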