google / deepconsensus

DeepConsensus uses gap-aware sequence transformers to correct errors in Pacific Biosciences (PacBio) Circular Consensus Sequencing (CCS) data.
BSD 3-Clause "New" or "Revised" License

Docker - tensorflow error #29

Closed AMMMachado closed 2 years ago

AMMMachado commented 2 years ago

Hi @MariaNattestad,

Over the past few months I have used DeepConsensus v0.2.0 without problems, and I now want to move to v0.3.0. The Docker image seems to work fine. I downloaded the model directly from the Git repository, from ./deepconsensus/testdata/model.

When I run DeepConsensus, the process stops with the error below:

I0708 11:35:33.810553 140663252805440 quick_inference.py:727] Using multiprocessing: cpus is 30.
I0708 11:35:33.820718 140663252805440 quick_inference.py:459] Loading model/checkpoint
2022-07-08 11:35:33.827062: I tensorflow/core/common_runtime/process_util.cc:146] Creating new thread pool with default inter op setting: 
I0708 11:35:33.948971 140663252805440 networks.py:358] Condensing input.
2022-07-08 11:35:34.684176: W tensorflow/core/util/tensor_slice_reader.cc:96] Could not open model/checkpoint: DATA_LOSS: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/training/py_checkpoint_reader.py", line 92, in NewCheckpointReader
    return CheckpointReader(compat.as_bytes(filepattern))
RuntimeError: Unable to open table file model/checkpoint: DATA_LOSS: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/deepconsensus", line 8, in <module>
    sys.exit(run())
  File "/usr/local/lib/python3.8/dist-packages/deepconsensus/cli.py", line 111, in run
    app.run(main, flags_parser=parse_flags)
  File "/usr/local/lib/python3.8/dist-packages/absl/app.py", line 312, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.8/dist-packages/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "/usr/local/lib/python3.8/dist-packages/deepconsensus/cli.py", line 102, in main
    app.run(quick_inference.main, argv=passed)
  File "/usr/local/lib/python3.8/dist-packages/absl/app.py", line 312, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.8/dist-packages/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "/usr/local/lib/python3.8/dist-packages/deepconsensus/inference/quick_inference.py", line 814, in main
    outcome_counter = run()
  File "/usr/local/lib/python3.8/dist-packages/deepconsensus/inference/quick_inference.py", line 734, in run
    loaded_model, model_params = initialize_model(
  File "/usr/local/lib/python3.8/dist-packages/deepconsensus/inference/quick_inference.py", line 476, in initialize_model
    checkpoint.restore(
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/training/tracking/util.py", line 2537, in restore
    status = self.read(save_path, options=options)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/training/tracking/util.py", line 2417, in read
    result = self._saver.restore(save_path=save_path, options=options)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/training/tracking/util.py", line 1423, in restore
    reader = py_checkpoint_reader.NewCheckpointReader(save_path)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/training/py_checkpoint_reader.py", line 96, in NewCheckpointReader
    error_translator(e)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/training/py_checkpoint_reader.py", line 40, in error_translator
    raise errors_impl.DataLossError(None, None, error_message)
tensorflow.python.framework.errors_impl.DataLossError: Unable to open table file model/checkpoint: DATA_LOSS: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?

Can you help me with the error?

Best regards,
André

AMMMachado commented 2 years ago

Hi again @MariaNattestad,

I found that if I use the model downloaded with the command below, the program works perfectly.

gsutil cp -r gs://brain-genomics-public/research/deepconsensus/models/v0.3/model_checkpoint/* "${QS_DIR}"/model/

Should I use this model?

If you can confirm that, please feel free to close this issue.

Best regards,
André

kishwarshafin commented 2 years ago

Hi @AMMMachado,

Yes, the correct way to download the model is from the GCP bucket. Because GitHub limits file sizes, the full v0.3 model is not available on GitHub. I'll close this issue, as it has been resolved.
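
For anyone who hits the same DataLossError, a quick pre-flight check on the model directory may help before starting a long run. The sketch below is not part of DeepConsensus or TensorFlow; `check_checkpoint_prefix` is a hypothetical helper. It relies only on the standard TensorFlow 2 checkpoint layout, in which a checkpoint prefix such as `model/checkpoint` is backed by a `<prefix>.index` file and one or more `<prefix>.data-*` shards, and it assumes the directory downloaded from the bucket follows that layout.

```python
import glob
import os


def check_checkpoint_prefix(prefix):
    """Return a list of problems found for a TF2-style checkpoint prefix.

    Hypothetical helper for illustration only. An empty list means the
    expected files are present; it does not prove the checkpoint is valid.
    """
    problems = []
    index_file = prefix + ".index"
    # glob.escape protects any glob metacharacters in the prefix itself.
    data_files = glob.glob(glob.escape(prefix) + ".data-*")

    if not os.path.isfile(index_file):
        problems.append("missing index file: " + index_file)
        if os.path.isfile(prefix):
            # Assumption: without an .index file, TensorFlow may try to
            # read the file at the prefix path as an older-format
            # checkpoint, which would be consistent with the
            # "not an sstable (bad magic number)" message.
            problems.append("plain file found at prefix path: " + prefix)
    if not data_files:
        problems.append("no data shards matching: " + prefix + ".data-*")
    return problems
```

Calling `check_checkpoint_prefix("model/checkpoint")` returns an empty list when both the index file and at least one data shard are present. Any returned messages point at which file to look for in the download.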