Closed pcjedi closed 6 years ago
The tfrecords file is generated by raw.py, if you follow the steps described in README.md.
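As background on why an arbitrary tfrecords file will not work: a TFRecord file is only a container of length-prefixed serialized records, so its contents depend entirely on the script that wrote it. Below is a minimal sketch, not part of Chiron, that counts the records in such a file using only the Python standard library. The function name `count_tfrecords` is hypothetical, and the CRC fields are skipped rather than verified.

```python
import struct


def count_tfrecords(path):
    """Count records in a TFRecord file without importing TensorFlow.

    Each record is framed as (little-endian):
      uint64 payload length, uint32 CRC of the length,
      payload bytes, uint32 CRC of the payload.
    The CRC fields are read and discarded here, not verified.
    """
    count = 0
    with open(path, "rb") as fh:
        while True:
            header = fh.read(8)
            if not header:
                break  # clean end of file
            if len(header) < 8:
                raise ValueError("truncated record header")
            (length,) = struct.unpack("<Q", header)
            if len(fh.read(4)) < 4:  # CRC of the length field
                raise ValueError("truncated length CRC")
            if len(fh.read(length)) < length:  # serialized payload
                raise ValueError("truncated record payload")
            if len(fh.read(4)) < 4:  # CRC of the payload
                raise ValueError("truncated payload CRC")
            count += 1
    return count
```

Running this on the file produced by raw.py would give a quick sanity check that the conversion wrote a non-empty, well-framed file. It does not check that the records have the layout Chiron expects.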
On Tue., 3 Jul. 2018, 1:50 am pcjedi, notifications@github.com wrote:
I found the supporting data at gigadb.org (http://gigadb.org/dataset/100425, which is incorrectly linked at academic.oup.com, by the way). I found the signal and label data, but a mandatory tfrecords file appears to be missing. Or can I use any tfrecords file?
Thanks, but I wanted to train Chiron with the 'original' training data provided by gigadb. The signal and label data provided via gigadb appear to be useless unless the corresponding tfrecords file is also provided.
Oh yes, that's correct: Chiron no longer uses the signal and label file format. You can do it in two ways: either use an older version of Chiron, e.g. 0.3, or download the original fast5 files from https://data.genomicsresearch.org/Projects/train_set_all
Hi @haotianteng. I wanted to know whether the model provided in chiron/model/DNA_Default was trained entirely on the dataset at http://gigadb.org/dataset/100425, or whether any additional training sets were used. If additional sets were used, could you point me to those resources?
Thanks, Naga.
Chiron V0.3 was trained entirely on the dataset described in the paper. No additional training sets were used. Chiron V0.4, on the other hand, uses an additional human dataset so that it performs better on human data.