Open BenxiaHu opened 3 years ago
Hello, I just read your paper about the silencer database and noticed that you implemented a CNN to predict silencers. Could you explain how to run DeepSilencer? What file does DeepSilencer take as input? Thanks in advance. Best,
Did you find out how it runs?
We are sorry for the confusion. DeepSilencer consists of a CNN model and an ANN model. The inputs of the CNN model are the one-hot encoded matrices of the sequences, and the inputs of the ANN model are the vectors of k-mer counts in the sequences. We provide ‘train_mat.hkl’ and ‘test_mat.hkl’ as demo input files in the ‘data’ folder; they contain 3200 and 800 DNA sequences, respectively. These sequences have a fixed length (200) and have been transformed into one-hot matrices. For model training and for prediction on fixed-length DNA sequences, you can run ‘run_self_projection.py’. For prediction on variable-length DNA sequences, you can see the details of our model settings in ‘run_crossdata_projection_human.py’ or ‘run_crossdata_projection_mouse.py’.
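For readers who want to see what a k-mer count vector looks like, here is a minimal sketch. It is not taken from the DeepSilencer code: the function name `kmer_counts`, the choice of k = 3, and the lexicographic ordering of k-mers over A, C, G, T are all assumptions made for illustration, and the repository may use different values.

```python
from itertools import product

import numpy as np


def kmer_counts(seq, k=3):
    """Count overlapping k-mers in seq (hypothetical helper, k is an assumed value).

    Returns a vector of length 4**k, ordered lexicographically over A, C, G, T.
    Windows containing any other character (e.g. N) are skipped.
    """
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = np.zeros(len(kmers), dtype=np.int64)
    seq = seq.upper()
    for start in range(len(seq) - k + 1):
        i = index.get(seq[start:start + k])
        if i is not None:
            counts[i] += 1
    return counts


demo_counts = kmer_counts("ACGTACGT", k=2)
```

With k = 2 the vector has 16 entries, one per dinucleotide, and an 8 bp sequence contributes 7 overlapping windows.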
Hello, thanks for your kind explanation. I still have one question about ‘train_mat.hkl’ and ‘test_mat.hkl’. Here is what train_mat.hkl looks like. Could you tell me how to make this matrix?
Best,
Making the matrix in "train_mat.hkl" can be divided into two steps: 1) we transform the 200 bp sequence of A/T/C/G into a matrix of shape (4, 200) via one-hot encoding, and 2) we then flatten that matrix into a vector of length 800.
Hello, thanks for your explanation. Why do you flatten the matrix into a vector of length 800?
Just for data storage. We then reshape the vector to (4, 200, 1) in the preprocessing step.
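The encode, flatten, and reshape steps described in this thread can be sketched roughly as follows. This is an illustration, not the repository's code: the helper names, the A, C, G, T row order, the row-major flattening, and the all-zero column for unknown bases are assumptions, so check the project's preprocessing script before relying on them.

```python
import numpy as np

BASES = "ACGT"  # assumed row order; the repository may order the rows differently


def one_hot(seq, length=200):
    """Encode a fixed-length DNA sequence as a (4, length) one-hot matrix."""
    if len(seq) != length:
        raise ValueError(f"expected {length} bp, got {len(seq)}")
    mat = np.zeros((4, length), dtype=np.float32)
    for j, base in enumerate(seq.upper()):
        i = BASES.find(base)
        if i >= 0:  # unknown bases such as N stay all-zero (an assumption)
            mat[i, j] = 1.0
    return mat


def flatten_for_storage(mat):
    """Flatten the (4, length) matrix row by row into a vector of length 4 * length."""
    return mat.reshape(-1)


def restore_for_cnn(vec, length=200):
    """Reshape the stored vector back to (4, length, 1) for the CNN input."""
    return vec.reshape(4, length, 1)


seq = "ACGT" * 50                 # toy 200 bp sequence
mat = one_hot(seq)                # shape (4, 200)
vec = flatten_for_storage(mat)    # shape (800,)
x = restore_for_cnn(vec)          # shape (4, 200, 1)
```

Stacking one such 800-length vector per sequence would give a 2-D array with one row per sequence, which is presumably what the demo files hold. The `.hkl` extension suggests the array was written with the hickle package, but that too is an assumption here.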