Error on train 0 model #34

Open OmarAshkar opened 1 year ago

OmarAshkar commented 1 year ago

Hello, I am trying to use the tool. I ran "Make patches", but the next step fails with:

ERROR: train_autoencoder (job 18) failed. 

I built the tool using the Docker container.

Any help is appreciated.

Best Regards, Omar

nanli-emory commented 1 year ago

Hi @OmarAshkar, could you provide more information about your runtime environment, such as the versions of your OS, Python, and Docker? Which GPUs do you use? That will help us narrow down the issue. Thank you.
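One way to gather the details requested above in a single pass is a small script like the following. This is only a sketch: the helper names `run_or_none` and `collect_environment` are made up for illustration and are not part of the tool, and it assumes `docker` and `nvidia-smi` are on the PATH where the script runs (inside a container, `docker` will usually be reported as not found).

```python
import platform
import shutil
import subprocess
import sys


def run_or_none(cmd):
    """Return a command's stdout as one string, or None if it is missing or fails."""
    if shutil.which(cmd[0]) is None:
        return None
    try:
        out = subprocess.run(cmd, capture_output=True, text=True, timeout=10)
    except (OSError, subprocess.SubprocessError):
        return None
    if out.returncode != 0:
        return None
    lines = out.stdout.strip().splitlines()
    # Join multiple lines (e.g. one per GPU) so the result fits on one line.
    return "; ".join(lines) if lines else None


def collect_environment():
    """Collect OS, Python, Docker, and GPU details for a bug report."""
    return {
        "os": platform.platform(),
        "python": sys.version.split()[0],
        "docker": run_or_none(["docker", "--version"]),
        "gpu": run_or_none(
            ["nvidia-smi", "--query-gpu=name,driver_version", "--format=csv,noheader"]
        ),
    }


if __name__ == "__main__":
    for key, value in collect_environment().items():
        print(f"{key}: {value if value is not None else 'not found'}")
```

Pasting the printed lines into a reply should cover the OS, Python, Docker, and GPU questions at once.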

VolodymyrChapman commented 11 months ago

TL;DR: Downgrade protobuf to 3.20.x or below.

I recently built the environment from the CUDA 11 requirements.txt on Ubuntu 22.04, and this same step failed because of protobuf. I can no longer find the exact error message, but the guidance it gave was to downgrade protobuf to version <=3.20.x. I ran:

$ pip install protobuf==3.20.0

After that, the environment worked for all functions (training model 0, embedding, superpixel generation, and retraining of DL).

My working environment:

