Bartvelp closed 4 years ago
Hello Bart,
The mapping from prob-matrix index to class is provided in a tab-delimited file. Here is the mapping file for the pre-trained species model (the 1st column holds the species names and the 2nd column holds the corresponding indexes):
https://github.com/MicrobeLab/DeepMicrobes/blob/master/data/name2label_species.txt
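As a rough illustration of how that file could be used, here is a minimal sketch that reads a two-column tab-delimited mapping (name, then index) and looks up the highest-scoring class for one row of probabilities. The helper name `load_label_map`, the two example species lines, and the probability values are made up for illustration, and the sketch assumes the file has no header line.

```python
import io


def load_label_map(handle):
    """Build {index: species_name} from lines of the form 'name<TAB>index'.

    Hypothetical helper; assumes no header line in the mapping file.
    """
    index_to_name = {}
    for line in handle:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        # Split on the last tab so names containing spaces stay intact.
        name, index = line.rsplit("\t", 1)
        index_to_name[int(index)] = name
    return index_to_name


# Made-up two-line excerpt standing in for name2label_species.txt.
example_file = io.StringIO("Escherichia coli\t0\nBacteroides fragilis\t1\n")
label_map = load_label_map(example_file)

# Made-up probabilities for one read, one value per class index.
probs = [0.9, 0.1]
best = max(range(len(probs)), key=probs.__getitem__)
print(label_map[best], probs[best])
```

With the real file, `open("name2label_species.txt")` would replace the `io.StringIO` stand-in.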
Note that you might have to train your own model according to your needs. This pre-trained model is specific to the context of the paper.
For single-end reads, you could first convert them to interleaved reverse-complement form using seqtk, but this is optional. If you do this before TFRecord conversion, edit the line containing (-1, 4, num_classes) in format_prediction.py, changing 4 to 2. If you do not, change 4 to 1.
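To make the role of that number concrete, here is a small NumPy sketch of what such a reshape does. It assumes the middle dimension is the number of consecutive prediction rows that belong to one read (or read pair) and that those rows are averaged afterwards. The averaging step, the function name `combine_rows`, and the example values are assumptions for illustration, not taken from format_prediction.py.

```python
import numpy as np


def combine_rows(prob_matrix, rows_per_read):
    """Group consecutive rows and average them into one row per read.

    Hypothetical helper: rows_per_read plays the role of the 4 / 2 / 1
    in the (-1, 4, num_classes) reshape discussed above.
    """
    num_classes = prob_matrix.shape[1]
    grouped = prob_matrix.reshape((-1, rows_per_read, num_classes))
    return grouped.mean(axis=1)


# Made-up predictions: 4 rows, 2 classes.
prob_matrix = np.array([
    [0.8, 0.2],
    [0.6, 0.4],
    [0.1, 0.9],
    [0.3, 0.7],
])

# Read plus its reverse complement: 2 rows per read, giving 2 reads.
per_read = combine_rows(prob_matrix, 2)

# Plain single-end input: 1 row per read, so nothing is merged.
unmerged = combine_rows(prob_matrix, 1)
print(per_read.shape, unmerged.shape)
```

If the number of rows is not a multiple of the group size, the reshape raises a ValueError, which is one way a mismatch between the input layout and the hard-coded 4 can show up.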
Thanks a lot for your swift response! I understand now, and it is working. As you suggested, I should retrain the model. I am again running into some issues with that, but I will open a new issue for it.
First off, thank you for your very nice paper, really interesting results! I am trying to reproduce your results, but I am getting stuck at the prediction step using the Seq2Species model. I have installed DeepMicrobes and am trying to predict the species of a 100 bp FASTA file (which contains 16S rRNA of E. coli). I do not have paired-end reads. These are the commands I currently run:
(DeepMicrobes) bart@Bart-HP-PAV14:~/DeepMicrobes/pipelines$ seq2tfrec_onehot.py --input_seq=../test_fasta_100bp.fa --output_tfrec=../temp.onehot.tfrec --is_train=False --seq_type=fasta
Which seems to run fine, and then:
(DeepMicrobes) bart@Bart-HP-PAV14:~/DeepMicrobes/pipelines$ ./predict_seq2species.sh -i ../temp.onehot.tfrec -p 1 -m ../weights_seq2species/ -o test_output
But this gives the following error:
This reshape also does not make sense to me. I would just like the probability for each class. It seems the probabilities are held in
prob_matrix[0]
but I don't know which index corresponds to which class (species). Any help would be greatly appreciated.