You could try adding a print statement around lines 94-97 of seq2tfrec_onehot.py to make sure that a training set (rather than a test set) is being converted.
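Something as simple as the sketch below would do; the helper and variable names here are placeholders of my own, not the actual code in seq2tfrec_onehot.py:

```python
# Placeholder debugging helper -- names are illustrative, not from the script.
def log_conversion_mode(is_train, read_id, label=None):
    """Print which branch of the converter is handling a read."""
    if is_train:
        print("training record: read_id=%s, label=%s" % (read_id, label))
    else:
        print("test record (no label): read_id=%s" % read_id)
```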
By the way, if you only want to reproduce 16S prediction using the seq2species model, the original implementation by Google might be helpful:
https://github.com/tensorflow/models/tree/master/research/seq2species
Yes, I already made sure it is converted to a training set with the convert_advance_file function, and that function correctly extracts the information.
It turns out the input_fn_train is set depending on the --encode_method flag, which I failed to set; it defaults to kmer, which is of course wrong. Setting --encode_method to one_hot fixes the TFRecord parsing, and training starts successfully.
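For context, my understanding of what the flag does (the function and feature names below are hypothetical, not DeepMicrobes' actual code): --encode_method decides which feature spec the input function uses to parse each record, so a TFRecord written by seq2tfrec_onehot.py can only be read back with the one_hot parser. A rough TF 1.x sketch:

```python
# Minimal sketch of flag-dependent TFRecord parsing (hypothetical names).
import tensorflow as tf

def parse_onehot(serialized):
    # one-hot records: the raw read string plus an integer label
    features = tf.parse_single_example(serialized, {
        "read": tf.FixedLenFeature([], tf.string),
        "label": tf.FixedLenFeature([], tf.int64),
    })
    return features["read"], features["label"]

def parse_kmer(serialized):
    # k-mer records: a variable-length list of k-mer ids plus a label
    features = tf.parse_single_example(serialized, {
        "kmers": tf.VarLenFeature(tf.int64),
        "label": tf.FixedLenFeature([], tf.int64),
    })
    return features["kmers"], features["label"]

def input_fn_train(tfrec_path, encode_method):
    # the flag picks the parser; the wrong one cannot decode the records
    parser = parse_onehot if encode_method == "one_hot" else parse_kmer
    return tf.data.TFRecordDataset(tfrec_path).map(parser)
```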
Calculating the loss seems to fail, however, and I am not sure what is causing it. I am getting this error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Tried to explicitly squeeze dimension 1 but dimension was not 1: 0
[[Node: sparse_softmax_cross_entropy_loss/remove_squeezable_dimensions/Squeeze = Squeeze[T=DT_INT64, squeeze_dims=[-1], _device="/job:localhost/replica:0/task:0/device:CPU:0"](IteratorGetNext:1)]]
Any idea on how to fix this/what causes this?
P.S. I found the original paper and code to be very convoluted and difficult to work with, and I am interested in also trying the other models in this repo.
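From the error message, my reading (which I have not confirmed in the code) is that the Squeeze node is applied to the labels tensor coming out of the input pipeline, and a trailing dimension of size 0 would mean the parsed records carry no label values at all. A shape sketch of what the loss expects, with made-up numbers:

```python
# Shape sketch only (TF 1.x); batch size and class count are made up.
import tensorflow as tf

batch, num_classes = 8, 2505
logits = tf.zeros([batch, num_classes])

# The loss expects one integer class id per read, i.e. labels of shape [batch]
# (a trailing [batch, 1] dimension gets squeezed away by
# remove_squeezable_dimensions, the op named in the error above).
labels_ok = tf.zeros([batch], dtype=tf.int64)
loss = tf.losses.sparse_softmax_cross_entropy(labels=labels_ok, logits=logits)

# If the parsed records carry no label values, the labels tensor ends up with a
# size-0 last dimension, which cannot be squeezed to [batch] -- my reading of
# "Tried to explicitly squeeze dimension 1 but dimension was not 1: 0".
labels_missing = tf.zeros([batch, 0], dtype=tf.int64)
# tf.losses.sparse_softmax_cross_entropy(labels=labels_missing, logits=logits)  # fails
```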
I'm not really sure about the solution, but I think the problem lies in the training data (e.g., the length of the DNA sequences) rather than the model. The model takes a --max_len flag whose default value is 150 bp. Try setting it to the maximum length of your full-length 16S data.
Oops, I see I forgot to include the command I ran:
DeepMicrobes.py --input_tfrec=combined_train_small.tfrec --model_name=seq2species --model_dir=seq2species_new_weights_small --max_len=400 --encode_method=one_hot
(I trimmed the sequences to 400 bp.)
So that should not be the problem. When I did forget to set --max_len, I got an error about padding to a smaller size than the original.
Try deleting the model_dir (rm -rf seq2species_new_weights_small) and running again.
Still no luck unfortunately: Log. This is my repo, in case you are puzzled by the print statements: github.com/Bartvelp/DeepMicrobes_clone
You should set the --num_classes flag to your actual number of categories. The default value is --num_classes=2505 (I had 2505 species for the pre-trained model).
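A small illustration of why this matters (my own sketch, not DeepMicrobes code): the classifier produces num_classes logits per read, and every integer label in the TFRecord must fall in [0, num_classes), otherwise the cross-entropy loss is undefined for that read.

```python
# Illustration only: check that your label ids are compatible with --num_classes.
import numpy as np

num_classes = 2505                   # default; replace with your own category count
labels = np.array([0, 17, 2504])     # example integer labels from your data
assert labels.min() >= 0 and labels.max() < num_classes, \
    "labels must lie in [0, num_classes)"
```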
Yes, thank you, I forgot that.
I figured it out: due to a weird bug or something, my TFRecord file did not contain the classes/labels. When I recreated them, everything worked out of the box. Thanks a lot for your help! Closing
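For anyone hitting the same thing: a quick way to check whether a TFRecord actually carries labels is to parse a few records and print their feature keys. This is a generic TF 1.x sketch, not part of DeepMicrobes; adjust the filename to your own .tfrec.

```python
# Generic TF 1.x sketch for inspecting a TFRecord file; the filename is just an
# example. Prints the feature keys of the first few records so you can see
# whether a label feature was written at all.
import tensorflow as tf

def inspect_tfrecord(path, n=3):
    for i, record in enumerate(tf.python_io.tf_record_iterator(path)):
        if i >= n:
            break
        example = tf.train.Example()
        example.ParseFromString(record)
        print("record", i, "features:", list(example.features.feature.keys()))

inspect_tfrecord("combined_train.tfrec")
```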
I have created a labelled FASTA file based on the RefSeq (full-length) 16S rRNA database like so:
I then converted this file to a TFRecord using this command:
seq2tfrec_onehot.py --input_seq=../combined_train_labelled.fa --output_tfrec=../combined_train.tfrec --is_train=True
Then when I try to train the seq2species model I get the following error:
```
(DeepMicrobes) bart@Bart-HP-PAV14:~/DeepMicrobes$ DeepMicrobes.py --input_tfrec=combined_train.tfrec --model_name=seq2species --model_dir=seq2species_new_weights --max_len=100
2020-05-30 17:36:23.820149: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
RUNNING MODE: train
I0530 17:36:23.822940 140213195958080 tf_logging.py:115] Using config: {'_model_dir': 'seq2species_new_weights', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 100000, '_save_checkpoints_secs': None, '_session_config': None, '_keep_checkpoint_max': 1000, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_service': None, '_cluster_spec':
```

If I add the following in the __call__ function of seq2species.py to remedy this error, the model seems to compile but eventually crashes with the following error:

Any help would be greatly appreciated!