bioinfomaticsCSU / deepsignal

Detecting methylation using signal-level features from Nanopore sequencing reads
GNU General Public License v3.0

Training of a custom model #79

Closed hukai916 closed 2 years ago

hukai916 commented 2 years ago

Hi developers,

Thanks for creating DeepSignal!

For my data, the modified bases don't appear in specific motifs. Instead, we have collected ground-truth information. Can I still use DeepSignal to train a model while leveraging that ground truth?

Best,

--Kai

PengNi commented 2 years ago

Hi @hukai916, basically, a new deepsignal model can be trained as long as we have positive samples and negative samples. However, without specific motifs (say we set --motifs as N), the performance is not guaranteed.
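For readers wondering what combining positive and negative samples might look like in practice, here is a minimal Python sketch. It is not deepsignal's own tooling: the helper `mix_and_split`, the toy row layout, and the choice to balance the two classes are all assumptions made for illustration. Real feature rows would come from deepsignal's feature extraction step, whose exact file format is not described in this thread.

```python
import random


def mix_and_split(pos_rows, neg_rows, valid_fraction=0.1, seed=0):
    """Balance, shuffle, and split labelled rows (hypothetical helper)."""
    rng = random.Random(seed)
    # Downsample the larger class so both labels are equally represented.
    n = min(len(pos_rows), len(neg_rows))
    rows = rng.sample(pos_rows, n) + rng.sample(neg_rows, n)
    # Shuffle so positive and negative rows are interleaved.
    rng.shuffle(rows)
    # Hold out the first slice for validation; the rest is for training.
    n_valid = int(len(rows) * valid_fraction)
    return rows[n_valid:], rows[:n_valid]


# Toy rows standing in for per-site feature lines (assumed layout);
# the last tab-separated column is the label: 1 = modified, 0 = unmodified.
pos = [f"read{i}\tfeatures\t1" for i in range(50)]
neg = [f"read{i}\tfeatures\t0" for i in range(80)]

train, valid = mix_and_split(pos, neg)
print(len(train), len(valid))  # prints "90 10"
```

With ground-truth labels instead of motifs, the positive list would hold rows from sites known to be modified and the negative list rows from sites known to be unmodified. Whether balancing the classes helps depends on the data.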

hukai916 commented 2 years ago

Thanks for your quick reply!

Though performance is not guaranteed, I am still interested in testing it out. The model-training part of the documentation seems minimal. I also looked at deepsignal train -h, but it is still quite hard for me to move forward. I ran into the following questions:

  1. Where is extract_features.py? I see it is inside the deepsignal_plant repo. Is there any usage documentation for it?
  2. What are positive and negative samples? How should I prepare and shuffle them? Do you have an example?

It would make my life much easier if you could supply a step-by-step tutorial on model training.

Again, thanks for creating deepsignal. It is awesome!

PengNi commented 2 years ago

Hi @hukai916, you can check previous issues (such as #74 and #53) for more information.

Best, Peng

hukai916 commented 2 years ago

Hi Peng,

This is very informative. I highly recommend adding this information to the documentation.

I will let you know if I run into other problems. Thanks again!

--Kai