Unable to run DeepNovo locally #23

BioGeek commented 2 weeks ago

When I run the benchmark locally with ./run.sh ./sample_data/9_species_human, I get the following error for DeepNovo:

Output file: ./outputs/9_species_human/deepnovo_output.csv
Processing algorithm: deepnovo
Processing file: 9_species_human/151009_exo4_1.mgf
100%|##################################################################################################################| 11016/11016 [00:02<00:00, 5381.66it/s]
11016 spectra written to ./input_data.mgf.
vocab_reverse  ['_PAD', '_GO', '_EOS', 'A', 'R', 'N', 'Nmod', 'D', 'Cmod', 'E', 'Q', 'Qmod', 'G', 'H', 'I', 'L', 'K', 'M', 'Mmod', 'F', 'P', 'S', 'T', 'W', 'Y', 'V']
vocab  {'_GO': 1, '_EOS': 2, '_PAD': 0, 'Mmod': 18, 'A': 3, 'E': 9, 'D': 7, 'G': 12, 'F': 19, 'I': 14, 'H': 13, 'K': 16, 'M': 17, 'L': 15, 'Nmod': 6, 'N': 5, 'Q': 10, 'P': 20, 'S': 21, 'R': 4, 'T': 22, 'W': 23, 'V': 25, 'Y': 24, 'Cmod': 8, 'Qmod': 11}
vocab_size  26
_buckets  [12, 22, 32]
num_ion  8
l2_loss_weight  0.0
embedding_size  512
num_layers  1
num_units  512
keep_conv  0.75
keep_dense  0.5
batch_size  128
epoch_stop  20
train_stack_size  4500
valid_stack_size  15000
test_stack_size  4000
buffer_size  4000
steps_per_checkpoint  100
random_test_batches  10
max_gradient_norm  5.0
2024-09-21 01:15:21.593487: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2024-09-21 01:15:21.593535: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2024-09-21 01:15:21.593547: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2024-09-21 01:15:21.593557: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2024-09-21 01:15:21.593589: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2024-09-21 01:15:23.091204: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:893] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2024-09-21 01:15:23.091293: I tensorflow/core/common_runtime/gpu/gpu_device.cc:940] Found device 0 with properties: 
name: NVIDIA GeForce RTX 4070 Laptop GPU
major: 8 minor: 9 memoryClockRate (GHz) 1.23
pciBusID 0000:01:00.0
Total memory: 7.76GiB
Free memory: 7.62GiB
2024-09-21 01:15:23.091305: I tensorflow/core/common_runtime/gpu/gpu_device.cc:961] DMA: 0 
2024-09-21 01:15:23.091308: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   Y 
2024-09-21 01:15:23.091317: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Creating TensorFlow device (/gpu:0) -> (device: 0, name: NVIDIA GeForce RTX 4070 Laptop GPU, pci bus id: 0000:01:00.0)
ModelInference: __init__()
ModelNetwork: __init__()
ModelInference: build_model()
ModelNetwork: build_network()
ModelNetwork: _build_cnn_spectrum()
ModelNetwork: _build_embedding_AAid()
ModelNetwork: _build_cnn_ion()
ModelNetwork: _build_lstm()
ModelNetwork: _build_cnn_ion()
ModelNetwork: _build_lstm()
ModelInference: restore_model()
restore model from ./translate.ckpt-48600
inspect_file_location(), input_file =  ./input_data.mgf
Total number of spectra = 11016
Load knapsack_matrix from default: knapsack.npy
READ & DECODE in stacks
Traceback (most recent call last):
  File "DeepNovo/deepnovo_main.py", line 83, in <module>
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "DeepNovo/deepnovo_main.py", line 35, in main
  File "/algo/DeepNovo/deepnovo_main_modules.py", line 2343, in decode
  File "/algo/DeepNovo/deepnovo_main_modules.py", line 115, in read_spectra
    peptide_ion_mz = float(re.split("=|\n", line)[1])
ValueError: invalid literal for float(): 509.725524902 238250.682373
Traceback (most recent call last):
  File "output_mapper.py", line 71, in <module>
    output_data = output_mapper.format_output(output_data)
  File "/algo/base/output_mapper.py", line 129, in format_output
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/frame.py", line 3370, in __setitem__
    self._set_item(key, value)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/frame.py", line 3444, in _set_item
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/frame.py", line 3426, in _ensure_valid_index
    raise ValueError('Cannot set a frame with no defined index '
ValueError: Cannot set a frame with no defined index and a value that cannot be converted to a Series
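
Judging from the first traceback, read_spectra seems to fail because the value after the "=" holds two numbers (509.725524902 238250.682373) rather than one. In MGF files the PEPMASS field may carry an optional intensity after the m/z, so that is presumably the line being parsed. The second traceback is most likely a follow-on error, since DeepNovo never produced any output to map. Below is a minimal sketch of a more tolerant parse, not DeepNovo's actual code: `parse_precursor_mz` is a hypothetical helper, and it assumes the first whitespace-separated token is the m/z.

```python
import re


def parse_precursor_mz(line):
    """Return the precursor m/z from a 'KEY=value' MGF header line.

    Assumption: when the value holds two numbers, the first is the m/z
    and the second is the precursor intensity, which is ignored here.
    """
    # Same split as in the traceback: take the text after the '='.
    value = re.split(r"=|\n", line)[1]
    # Keep only the first token so a trailing intensity does not break float().
    return float(value.split()[0])


if __name__ == "__main__":
    # The value that triggered the ValueError in the log above.
    print(parse_precursor_mz("PEPMASS=509.725524902 238250.682373\n"))
```

Lines that contain only the m/z are parsed the same way as before, so this change should not affect MGF files that already worked.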