bioinfomaticsCSU / deepsignal

Detecting methylation using signal-level features from Nanopore sequencing reads

GNU General Public License v3.0

data problem #75

Closed: Xiaoliugogogo closed this issue 2 years ago

Xiaoliugogogo commented 2 years ago

Hi PengNi, I have a question about the data. Following the steps described in your paper, I downloaded data from the European Nucleotide Archive (ENA), for example accession PRJEB23027. The tar.gz archive is 44G, but each fast5 file inside seems to be only about 4K. Is there a problem with the data I downloaded? Could you share the data you downloaded from ENA for our reference? Also, does the downloaded data need to go through the multi_to_single_fast5 step?

Best, zhongyu

PengNi commented 2 years ago

Hi zhongyu, that is the data I used. You can just download it. Each compressed archive contains thousands of fast5 files or more.

Best, Peng
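
One way to check the archive contents described above without unpacking all 44G is to stream through the tar.gz and count the fast5 members. This is only a minimal sketch using Python's standard `tarfile` module; the function name `summarize_fast5_members` and the archive file name in the usage note are hypothetical and do not come from deepsignal or ENA.

```python
import tarfile


def summarize_fast5_members(archive_path):
    """Count the .fast5 members of a tar.gz archive and sum their sizes.

    The archive is read sequentially, so nothing is extracted to disk.
    """
    count = 0
    total_bytes = 0
    with tarfile.open(archive_path, "r:gz") as tar:
        for member in tar:
            # Skip directories and anything that is not a fast5 file.
            if member.isfile() and member.name.endswith(".fast5"):
                count += 1
                total_bytes += member.size
    mean_bytes = total_bytes / count if count else 0.0
    return {"count": count, "total_bytes": total_bytes, "mean_bytes": mean_bytes}
```

Calling `summarize_fast5_members("downloaded_archive.tar.gz")` (a placeholder file name) returns the number of fast5 files and their mean uncompressed size, which can be compared with the per-file size shown by the file manager.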

Xiaoliugogogo commented 2 years ago

Hi PengLi

Yes, the archive contains thousands or even tens of thousands of files, but each one is only about 4K. Were the files you used also this size? As I asked before, is each fast5 file an independent read? Does it need to go through the multi_to_single_fast5 step?
In addition, I am a student at Harbin Institute of Technology, and I also work in bioinformatics. Would you mind sharing your contact information? I would like to discuss this with you further. Alternatively, I can leave my contact information so that you can reach me.

Best, zhongyu

PengNi commented 2 years ago

@Xiaoliugogogo, as far as I know, the fast5 files in PRJEB23027 are already in single-read format, so there is no need for the multi-to-single conversion. You can use HDFView to inspect the fast5 files, and use other tools (guppy, tombo, or others) to test them.

Best, Peng
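
As a scripted alternative to opening files in HDFView, the top-level HDF5 groups of a fast5 file can be listed to guess its layout. The sketch below assumes the commonly seen conventions that single-read files have a top-level `Raw` group and multi-read files have top-level groups named `read_<id>`; this heuristic and the function names are assumptions for illustration, not part of deepsignal. It also assumes the third-party `h5py` package is installed.

```python
def classify_fast5_layout(top_level_keys):
    """Guess the fast5 layout from the names of the top-level HDF5 groups.

    Heuristic (an assumption, not an official check):
    - any group named "read_<id>" suggests a multi-read file
    - a "Raw" group suggests a single-read file
    """
    keys = list(top_level_keys)
    if any(key.startswith("read_") for key in keys):
        return "multi-read"
    if "Raw" in keys:
        return "single-read"
    return "unknown"


def inspect_fast5(path):
    """Open a fast5 file read-only and classify its layout."""
    import h5py  # third-party dependency, imported here so the helper above works without it

    with h5py.File(path, "r") as fast5:
        return classify_fast5_layout(fast5.keys())
```

If `inspect_fast5` reports `"single-read"` for a sample of the downloaded files, that is consistent with the reply above that no multi-to-single conversion is needed.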

PengNi commented 2 years ago

Also, feel free to email me or message me on QQ. We can communicate in Chinese, since we are both Chinese.

Xiaoliugogogo commented 2 years ago

Hi PengLI, my QQ is 374696483 and my email is 374696483@qq.com. Thank you for taking the time to reply despite your busy schedule. We can continue the discussion on QQ. Best, zhongyu

Xiaoliugogogo commented 2 years ago

Hi PengLi, could you please add me on QQ? I cannot see your email or any other contact method on my side. Best, zhongyu

PengNi commented 2 years ago

@Xiaoliugogogo, you can check my GitHub homepage; I believe all the relevant info is there. Also, my name is Peng Ni.

Best, Peng