Closed tobbber closed 7 months ago
Sorry for the late reply. It sounds like a great idea. I will take a look at your PR. Thanks for your effort!
Regarding the multiprocessing issue, I guess the reason is the incompatibility with jupyter notebook or Colab. I believe there could be a fix for that, in case you are interested. Anyway having the option to disable multiprocessing is great.
Thanks for taking a look at the PR.
Yes i think having the option to disable multiprocessing would be nice, at least for the standard interface Molscribe.predict_images
.
However, you probably have better judgment on whether multiprocessing should be disabled by default or how to enable/disable it, so feel free to suggest changes.
First off, thanks for the project. I had some issues when using Molscribe with MLServer due to Molscribe using multiprocessing in function
molscribe.chemistry.convert_graph_to_smiles(...)
. (This seems to cause issues because Mlserver also runs Molscribe within a multiprocessing process)However when i forked Molscribe and disabled multiprocessing, i noticed that
molscribe.predict_images
ran faster than with multiprocessing enabled. For small image batches, this difference was especially significant. I was able to reproduce this in a colab notebook.Even though these results might change for very large batch sizes (>128 images), it think it would make sense to make the use of multiprocessing optional and configurable.
This could be done by:
molscribe.Molscribe(model_path, device=None, num_workers=0)
MOLSCRIBE_WORKERS=16
.Feel free to give your feedback to this idea. I would also be happy to provide a pull request.