I have made a few changes to the repository; here are some highlights:
Renamed files and folders to be more Unix-friendly, since a large number of people in the lab are Ubuntu users. Spaces between words were replaced with underscores.
Mentioned the system's inherent problem with handling a large number of small-scale jobs (1 GPU or less). For this reason, it is even more important to take care of the cluster and make fair use of it.
Added some examples of the different ways you can use torch.no_grad().
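As a quick illustration of those ways (a minimal sketch with a hypothetical tiny model, not the actual examples from the repo), torch.no_grad() works both as a context manager and as a decorator:

```python
import torch

# Hypothetical tiny model, just for illustration.
model = torch.nn.Linear(4, 2)
x = torch.randn(3, 4)

# 1) As a context manager: operations inside the block are not tracked
#    by autograd, so the output does not require gradients.
with torch.no_grad():
    y = model(x)
print(y.requires_grad)  # False

# 2) As a decorator: the whole function runs with gradient tracking off.
@torch.no_grad()
def evaluate(batch):
    return model(batch)

z = evaluate(x)
print(z.requires_grad)  # False

# Outside both, the forward pass is tracked as usual.
print(model(x).requires_grad)  # True
```

Both forms are useful for inference and validation loops, where skipping the autograd bookkeeping saves memory and time.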
Mentioned the possibility of using torchrun to launch distributed training. Added links to the official documentation and the beginner tutorial.
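For context, torchrun spawns one process per worker and communicates the topology through environment variables (RANK, WORLD_SIZE, LOCAL_RANK). A minimal sketch of how a training script can read them (with fallbacks so the same script also runs as a plain single process; the script name below is a placeholder):

```python
import os

# torchrun sets these environment variables for each worker process.
# The defaults let the script also run standalone, without torchrun.
rank = int(os.environ.get("RANK", 0))
world_size = int(os.environ.get("WORLD_SIZE", 1))
local_rank = int(os.environ.get("LOCAL_RANK", 0))

print(f"worker {rank}/{world_size}, local rank {local_rank}")
```

Launched, for example, as `torchrun --nproc_per_node=2 train.py`, where `train.py` is whatever your training script is called.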
Added a local link to the guidelines in the README file.
Some things I feel are missing but did not have time to add:
[ ] Complete the examples in the example_code folder, like amp_example.py. Also, create the files mentioned in the text that do not exist yet, like simple_example.py or gpu_data_augmentation.py.
[ ] Change the writing style of the documentation in some parts to sound more like a tutorial than a formal complaint.
[ ] Add some examples with commonly used models or datasets.
[ ] Add some examples and documentation for MONAI, since some people in the office use this library for their work.
I am more than happy to help with all of this, but to scale it up a bit, I think we should have a short chat. I hope you find the changes useful.