Digital-Humans-23 / a2

Conda works fine on the server, but "command not found" is reported when submitting a job #4

Closed CHENGEZ closed 1 year ago

CHENGEZ commented 1 year ago

Hi,

I have installed conda on the euler server, and it works fine as I can create conda envs and activate them normally.

The created conda environments are shown below: [image]

But when I follow the instructions in HOWTO.md to submit a job, I always get the following output in the .err file: [image]

It seems the conda command is not found when called from the job script.

And of course since the environment is not successfully activated, "Module not found" issues also appear in the .out file. Could you please provide some hint on why conda works from the terminal but not from the job script?

MiguelZamoraM commented 1 year ago

I'm also puzzled by this. I believe it depends on the type of group access that each user has on the server: the conda environment works well for some people and not for others.

A practical solution is to use a virtual environment instead of a conda environment. Please see this.

Be aware of disk space limitations. Each additional environment takes up more space, so you might need to remove some of the environments you have already created, such as simpleEnv.

juhe9842 commented 1 year ago

Hi, I get the same error message, but my jobs have already started training on the cluster. I don't really understand why it works despite this error message.

Joshua31415 commented 1 year ago

I had the same issue, but adding source ~/.bashrc on a line before conda activate pylocoEnv in the job script solved it.
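For reference, here is a minimal sketch of what the relevant part of the job script could look like with this fix. It assumes that conda is set up in ~/.bashrc (as conda init usually does) and that the job runs in a non-interactive shell that does not read that file on its own. The helper name load_conda is hypothetical and not part of HOWTO.md, and the scheduler directives and the training command are left out because they depend on the course setup.

```shell
#!/bin/bash
# Sketch only: load_conda is a hypothetical helper, not part of HOWTO.md.
# Assumption: conda is initialized in ~/.bashrc, which batch jobs may not read.

load_conda() {
    # $1: init file to read (defaults to ~/.bashrc)
    rc="${1:-$HOME/.bashrc}"
    if ! command -v conda >/dev/null 2>&1; then
        # `.` is the portable spelling of `source`
        [ -f "$rc" ] && . "$rc"
    fi
    # Succeeds only if conda is now available as a command or shell function
    command -v conda >/dev/null 2>&1
}

if load_conda; then
    conda activate pylocoEnv
else
    echo "conda still not found after reading ~/.bashrc" >&2
fi

# ... rest of the job script from HOWTO.md goes here, unchanged ...
```

If the error message at the end still appears, one thing to check is whether ~/.bashrc returns early for non-interactive shells before it reaches the conda setup lines; some default ~/.bashrc files do this.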

CHENGEZ commented 1 year ago

Thank you very much! It does solve it.