RS-DAT / JupyterDaskOnSLURM

Apache License 2.0
16 stars 3 forks source link

Issue with `copy_folder` and `install_JD` #79

Open SarahAlidoost opened 3 months ago

SarahAlidoost commented 3 months ago

Currently, in the function install_JD:

if not folder_exists:
        print ('Cloning JupyterDaskonSLURM on remote host...')
        utils.ssh_remote_executor(config_inputs, copy_folder, config_inputs)

while the function copy_folder copies a folder instead of cloning. So, the print statement needs a fix. Also, in the function copy_folder, a relative path ../JupyterDaskOnSLURM is used which means first cloning the repo and cd to the directory. Why the whole folder JupyterDaskOnSLURM on remote is needed? can we just copy the scripts that are needed?

fnattino commented 2 months ago

Good point @SarahAlidoost . I believe that the reason why the repository is currently copied to the remote (and not cloned) is that in this way you can modify the scripts locally before uploading them to the platform. But this allows you to modify these scripts only when installing, you will be bound to use these scripts for all time you will be running jupyter..

Maybe a better way to do this could be:

What do you think? When I am back I could try to implement these changes and open a PR towards fix_71, so that #75 will then solve all these at once.

fnattino commented 2 months ago

Hi @SarahAlidoost and @rogerkuou sorry but I could not get that far in what I indended to do on updating the script to work as I was trying to describe in the previous message. The discussion with @rogerkuou made me think that maybe we should not modify too much the way in which Dask configuration is provided, i.e. via config files, so maybe we should not implement passing parameters via environment variables (see third point above).

I still think we should avoid having to clone the repository on the remote - maybe we could even drop the installation part from the Python script? I have the impression we have been the only ones using it in the courses, while the other people that have been using RS-DAT have preferred to go through the manual installation.. Also, seeing how things have evolved for Spider, maybe we should just give more flexibility on which environment should be used and avoid pushing the use of conda. See also #82.

Maybe worth it to discuss it briefly together once we are all back?