Closed johnSmith1990 closed 4 years ago
Hi There!
Thank you!
It appears you don't have cityscapesscripts installed. It can be installed with pip:
python -m pip install cityscapesscripts
The GitHub repo for the cityscapesscripts is below with more instructions on using the dataset and understanding its directory structure and API.
https://github.com/mcordts/cityscapesScripts
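If it helps, a quick way to confirm the install from Python is to check whether the package can be found before running anything else. This is only a minimal sketch: `is_installed` is a hypothetical helper name, and it assumes the package is importable under the top-level module name `cityscapesscripts`.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Assumption: the pip package is importable as "cityscapesscripts".
    if is_installed("cityscapesscripts"):
        print("cityscapesscripts is available")
    else:
        print("cityscapesscripts is missing; run: python -m pip install cityscapesscripts")
```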
Thanks Joel
Thanks, dear Joel. I am going to customize your project. It uses many networks, but I want to use Fast-SCNN without Horovod. Regards, John
No worries, good luck.
Although I will say, if you are doing multi-GPU training on an HPC system, Horovod is an excellent solution, as it makes good use of the resources provided by job schedulers like Slurm.
If you don't have any further questions, let me know and I'll close the ticket.
Thanks Joel
Thanks. Could you share code for training Fast-SCNN on a single GPU? There is some sample code for the model, but it doesn't include scripts for training and testing.
regards, John
Hi John
This repo is my code for training (train.py) and testing (predict.py) on one or more GPUs.
There are plenty of code examples all over the web of training different models on the Cityscapes dataset with PyTorch and TensorFlow (some I have even referenced in my own repo).
Please research and refer to those, as well as my own repo, to create and train your model on Cityscapes.
Thank you Joel
Horovod doesn't install successfully. It gives this error:
ERROR: Command errored out with exit status 1: /home/amin/anaconda3/envs/tensor2_env/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-3tr9w6at/horovod/setup.py'"'"'; __file__='"'"'/tmp/pip-install-3tr9w6at/horovod/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-mrjkw7mu/install-record.txt --single-version-externally-managed --compile --install-headers /home/amin/anaconda3/envs/tensor2_env/include/python3.7m/horovod Check the logs for full command output.
Can you share a pretrained model so I can test the FPS of the model? I want to train the model on my custom dataset.
What are the values of hvd.rank and hvd.size for a single GPU? I can't install Horovod.
Please look through my repo completely before asking questions. I have weights from training inside my results directory.
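For a rough FPS figure, something like the following could be used. It is a sketch under assumptions, not code from the repo: `measure_fps` and `predict_fn` are hypothetical names, and `predict_fn` stands in for whatever callable runs one forward pass of the loaded model on one input.

```python
import time
from typing import Any, Callable


def measure_fps(predict_fn: Callable[[Any], Any], sample: Any,
                warmup: int = 5, iterations: int = 50) -> float:
    """Estimate frames per second for a single-input inference callable."""
    # Warm-up calls are excluded from timing (first calls are often slower).
    for _ in range(warmup):
        predict_fn(sample)

    start = time.perf_counter()
    for _ in range(iterations):
        predict_fn(sample)
    elapsed = time.perf_counter() - start

    # Guard against a zero reading from very fast callables.
    return iterations / max(elapsed, 1e-12)


if __name__ == "__main__":
    # Placeholder workload standing in for a real model call.
    fps = measure_fps(lambda x: sum(x), list(range(10_000)))
    print(f"{fps:.1f} FPS")
```

On a GPU, make sure `predict_fn` only returns once the result has actually been computed (for example, by converting the output to a NumPy array), otherwise the timing may look better than it really is.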
I won't be able to help you install Horovod, as it depends on what your needs are. I installed MPI for my configuration, but you may not need that. Please refer to https://horovod.readthedocs.io/en/stable/install_include.html and https://github.com/horovod/horovod. You may also need to ensure a compatible g++ compiler is installed.
Please refer to the API documentation, that is why it exists: https://horovod.readthedocs.io/en/stable/api.html
hvd.size() should be 1; hvd.rank() depends on the process (but for a single GPU it will likely only return 0).
If you write your code right, it shouldn't matter what the size is. Your training regime should work no matter what hvd.size() returns (except for verbose output such as epoch progress in the console, which should be done by a single rank).
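To make that concrete, here is a minimal sketch of how a training script could fall back to rank 0 and size 1 when Horovod is not installed. The helper names (`load_horovod`, `rank_and_size`, `is_chief`) are hypothetical and not part of this repo or of Horovod; the only Horovod calls used are `hvd.init()`, `hvd.rank()`, and `hvd.size()`.

```python
import importlib


def load_horovod(module_name: str = "horovod.tensorflow"):
    """Return an initialised Horovod module, or None if it cannot be imported."""
    try:
        hvd = importlib.import_module(module_name)
    except ImportError:
        return None
    hvd.init()  # must be called before rank() or size()
    return hvd


def rank_and_size(hvd):
    """Return (rank, size); without Horovod, behave like one process on one GPU."""
    if hvd is None:
        return 0, 1
    return hvd.rank(), hvd.size()


def is_chief(hvd) -> bool:
    """Only rank 0 should print progress or write checkpoints."""
    return rank_and_size(hvd)[0] == 0


if __name__ == "__main__":
    hvd = load_horovod()
    rank, size = rank_and_size(hvd)
    if is_chief(hvd):
        print(f"running as rank {rank} of {size}")
```

With this pattern the rest of the training loop can use `rank` and `size` without caring whether Horovod is present.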
I am a very busy person (multiple projects), and unfortunately I do not have the time to solve every problem you run into. Please refer to the Horovod and TensorFlow documentation to understand functions and to solve installation issues.
Please note that all of these packages have likely been updated, and you will likely need to adjust my code to make it compatible with whatever versions you have.
If you have any more questions directly related to my repo, then please ask, but I cannot help you figure out your own code or your installation issues.
Thank you Joel
Thank you. If you can't help anyone using your repo, why do you share your code? To waste time? Please close the issue and delete your repo so you don't waste others' time anymore.
Regards.
I share it so people can use the code, including my training code. You are not the only one who has used the code and asked me questions. I have helped an organization make tutorials from my code, based on their feedback from using it in their system.
Please do not accuse me of wasting people's time. That is very rude and not appreciated. You have asked questions that are beyond the scope of GitHub issues, which are reserved for bugs, feature requests, changes needed to the repo, or problems using my code. They are not for full answers on how to install packages, or for answers that can be found in the API reference or documentation of the packages you are installing.
Please respect your colleagues' time and efforts by going through the same toil as I did: reading about and researching the packages you are using before using them.
I have 3 separate projects and a day job at this time. I have taken what little personal time I have to help you, and you have responded to me rudely.
I will not be deleting my repo, as others are finding it useful and have forked it. To others who are reading this: I want to assure you that I take every issue seriously.
If you make a post with code that includes my own and have issues running it, then I will help you in any way that I can.
Kind Regards Joel
Hi dear, thanks for your great project. I want to train the network with the Cityscapes dataset but got this error: