gml16 / rl-medical

Communicative Multiagent Deep Reinforcement Learning for Anatomical Landmark Detection using PyTorch.
https://arxiv.org/abs/2008.08055
Apache License 2.0

What is the recommended python version? #11

Closed Vathsan closed 1 year ago

Vathsan commented 1 year ago

I am trying to train a model, but due to a version mismatch I run into some import errors. I am using Python 3.8.2. Is there a specific Python version that I should be using?

Traceback (most recent call last):
  File "DQN.py", line 8, in <module>
    from logger import Logger
  File "/scratch/sshanmug/rl-medical/src/logger.py", line 6, in <module>
    from torch.utils.tensorboard import SummaryWriter
  File "/scratch/sshanmug/rl-medical/env/lib/python3.8/site-packages/torch/utils/tensorboard/__init__.py", line 12, in <module>
    from .writer import FileWriter, SummaryWriter  # noqa: F401
  File "/scratch/sshanmug/rl-medical/env/lib/python3.8/site-packages/torch/utils/tensorboard/writer.py", line 9, in <module>
    from tensorboard.compat.proto.event_pb2 import SessionLog
  File "/scratch/sshanmug/rl-medical/env/lib/python3.8/site-packages/tensorboard/compat/proto/event_pb2.py", line 6, in <module>
    from google.protobuf import descriptor as _descriptor
  File "/scratch/sshanmug/rl-medical/env/lib/python3.8/site-packages/google/protobuf/descriptor.py", line 51, in <module>
    from google.protobuf.pyext import _message
ImportError: /scratch/sshanmug/rl-medical/env/lib/python3.8/site-packages/google/protobuf/pyext/_message.cpython-38-x86_64-linux-gnu.so: undefined symbol: _ZN4absl12lts_2023012512log_internal9kCharNullE
gml16 commented 1 year ago

Hello, have you tried running the code through the conda environment? If not, you can create it with `conda env create -f environment.yml`; the YAML file is provided in the repo. It installs Python 3.8.5 along with all the necessary dependencies. Please let me know if you encounter any issues with the Conda environment.

Vathsan commented 1 year ago

Hey @gml16, thanks for the response. I am using a pip environment. I was able to train and evaluate the models after tweaking some of the library versions (protobuf and opencv-python).
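For anyone hitting the same abseil `undefined symbol` error in a pip environment, the tweak mentioned above amounts to pinning protobuf and opencv-python. The exact pins below are illustrative assumptions, not the versions actually used in this thread:

```text
# Hypothetical requirements fragment (version pins are assumptions)
protobuf<4            # newer 4.x wheels can clash with older tensorboard builds
opencv-python==4.5.*  # an older opencv-python line compatible with Python 3.8
```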

My goal is to predict a set of landmarks in echocardiography images (ultrasound of the heart). I can see that you used 72 3D fetal head ultrasound images to train the model. Do you have any suggestions on the number of images I should be using? My primary goal is to identify only 3 landmarks. I would also appreciate any tips on how I can reduce the training time. Thanks!

gml16 commented 1 year ago

Glad to hear the training is now working. As for the number of images: the more, the better. If the images follow a similar distribution you won't need as many as if, for example, the patients suffer from different heart conditions or the images were captured by different kinds of scanners. It's not an exact science; you can retrain with different subset sizes of your dataset and extrapolate the accuracy improvement from adding more data. Hope that helps :)
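The subset-and-extrapolate idea can be sketched as a simple learning-curve fit. Everything here is illustrative: the subset sizes and accuracies are made-up numbers, and the log-linear curve is just one common heuristic, not something prescribed by the repo:

```python
import numpy as np

def extrapolate_accuracy(sizes, accs, target_size):
    """Fit a rough log-linear learning curve acc ~ a + b*log(n) to
    (subset size, accuracy) pairs and predict accuracy at target_size.
    A heuristic sketch, not an exact science."""
    sizes = np.asarray(sizes, dtype=float)
    accs = np.asarray(accs, dtype=float)
    # Least-squares fit of accuracy against log(dataset size)
    b, a = np.polyfit(np.log(sizes), accs, 1)
    return a + b * np.log(target_size)

# Hypothetical accuracies from retraining on subsets of 18, 36 and 54 images,
# extrapolated to the full 72-image dataset
pred = extrapolate_accuracy([18, 36, 54], [0.70, 0.78, 0.82], 72)
```

If the predicted gain from doubling the dataset is marginal, collecting more images is probably not the best use of effort.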

gml16 commented 1 year ago

To answer how to reduce training time: you can use a faster CPU or GPU. If you need bigger improvements, you can implement a multithreaded version of environment stepping to collect data faster. I believe this is the bottleneck at the moment, but you could make sure by profiling the code. Pull requests are more than welcome.
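One way to confirm where the time goes is the standard-library profiler. The sketch below wraps any callable in `cProfile` and returns the top cumulative-time entries; `step_env` is a hypothetical stand-in for one environment step, not a function from this repo:

```python
import cProfile
import io
import pstats

def profile_top(func, *args, n=10, **kwargs):
    """Run func under cProfile and return (result, report) where report
    lists the top-n functions by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    result = func(*args, **kwargs)
    profiler.disable()
    buf = io.StringIO()
    pstats.Stats(profiler, stream=buf).sort_stats("cumulative").print_stats(n)
    return result, buf.getvalue()

# Hypothetical stand-in for a single environment step
def step_env():
    return sum(i * i for i in range(100_000))

result, report = profile_top(step_env)
```

If environment stepping dominates the report, parallelising it (e.g. with `concurrent.futures` or multiple worker processes) is the place to start.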

Vathsan commented 1 year ago

Thank you very much for the details. This really helps. I am looking forward to seeing how this works for the echo data.