underworldcode / UWGeodynamics

Underworld Geodynamics

docker local mount permission issues: #186

Closed. brightson999 closed this issue 4 years ago.

brightson999 commented 4 years ago

Hi everyone,

I first created the container with the following command: docker run -d --name UD-cu11 -i -t -p 11.11.11.11:8888:8888 -v /data/underworld/uwgeodynamics:/home/jovyan/workspace underworldcode/uwgeodynamics

I then copied some of the example files from the Docker file system into the mounted volume. However, the Jupyter notebooks are read-only: the notebooks can be run, but they cannot write their results to the local disk mounted into the container. All files in the local folder are read-only, and Jupyter reports "Not enough permissions".

Another small question: is there only one way to parallelize, i.e. converting the notebook with jupyter nbconvert --to python my_script.ipynb and then running the resulting script with mpirun -np 4 python my_script.py? Or can I set up IPython Clusters in Jupyter notebooks to parallelize instead?

Cheers.

julesghub commented 4 years ago

I suspect the 1st issue is because your UID, on the host computer, is not 1000. The jovyan user is 1000 in the docker container. If your UID is not 1000 you'll have issues writing to directories owned by a different UID, unless you're in the same group. Try: $ echo $UID
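
For reference, a quick way to compare the two (a sketch; the container name UD-cu11 is taken from your docker run command above):

# on the host machine: print your numeric user id
$ id -u
# inside the running container: print the jovyan user's id
$ docker exec UD-cu11 id -u jovyan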

If your UID is NOT 1000, a quick and dirty fix is to make the host machine directory /data/underworld/uwgeodynamics world-accessible. (This might not be suitable depending on your host machine.) To do that run: $ chmod a+xrw -R /data/underworld/uwgeodynamics
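
An alternative that avoids world-writable permissions (a sketch, assuming the container's jovyan user really is UID 1000) is to hand ownership of the host directory to that UID instead:

# give UID 1000 ownership of the mounted directory (recursive)
$ sudo chown -R 1000 /data/underworld/uwgeodynamics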

Re: 2nd question. Yes, just one way, via mpirun -np 4 python my_script.py. IPython Clusters is not the same parallelisation as running the above. N.B. you can modify the hardware resources used by the docker container with the options explained here: https://docs.docker.com/config/containers/resource_constraints/
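
Putting the two commands together, the workflow from inside the container looks roughly like this (a sketch; substitute your own notebook name):

# convert the notebook to a plain python script
$ jupyter nbconvert --to python my_script.ipynb
# run the resulting script in parallel on 4 processes
$ mpirun -np 4 python my_script.py

If you want to cap the CPUs the container can use, the docker run --cpus option described in that page is probably the simplest starting point.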

brightson999 commented 4 years ago

I followed your suggestion and got the following output:

jovyan@bd2c3b412458:~$ echo $UID
1000
jovyan@bd2c3b412458:~$ ls
README.md  UWGeodynamics  cheatsheet  development  examples  install_guides  joss  test  user_guide  workspace
jovyan@bd2c3b412458:~$ ls -l
total 36
-rw-r--r-- 1 jovyan users  991 Mar 26 00:04 README.md
drwxr-xr-x 1 jovyan users 4096 Mar 26 00:37 UWGeodynamics
drwxr-xr-x 2 jovyan users 4096 Mar 26 00:04 cheatsheet
drwxr-xr-x 4 jovyan users 4096 Mar 26 00:04 development
drwxr-xr-x 5 jovyan users 4096 Mar 26 00:04 examples
drwxr-xr-x 2 jovyan users 4096 Mar 26 00:04 install_guides
drwxr-xr-x 2 jovyan users 4096 Mar 26 00:04 joss
drwxr-xr-x 3 jovyan users 4096 Mar 26 00:04 test
drwxr-xr-x 3 jovyan users 4096 Mar 26 00:04 user_guide
drwxrwxrwx 5 1009   1010   125 May  9 23:30 workspace

It is 1000, but it still doesn't work.

Finally I made the host machine directory /data/underworld/uwgeodynamics world-accessible. Now it's ready to run.

brightson999 commented 4 years ago

Hi @julesghub

Thank you for your earlier suggestion, but I get another error when I test Tutorial_11_Coupling_with_Badlands.py.

jovyan@bd2c3b412458:~/UWGeodynamics/tutorials$ mpirun -np 24 python Tutorial_11_Coupling_with_Badlands.py

It ran for a while and produced result folders for both Badlands and UW, but then it failed with the errors below:

[5]PETSC ERROR: ------------------------------------------------------------------------
[5]PETSC ERROR: Caught signal number 7 BUS: Bus Error, possibly illegal memory access
[5]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[5]PETSC ERROR: or see https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
[5]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
[5]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run
[5]PETSC ERROR: to get more information on the crash.
[5]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
[5]PETSC ERROR: Signal received
[5]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
[5]PETSC ERROR: Petsc Release Version 3.12.1, Oct, 22, 2019
application called MPI_Abort(MPI_COMM_WORLD, 59) - process 6
[5]PETSC ERROR: Tutorial_11_Coupling_with_Badlands.py on a named bd2c3b412458 by Unknown Mon May 11 08:25:27 2020
[5]PETSC ERROR: Configure options --with-debugging=0 --prefix=/usr/local --COPTFLAGS="-g -O3" --CXXOPTFLAGS="-g -O3" --FOPTFLAGS="-g -O3" --with-zlib=1 --download-hdf5=1 --download-mumps=1 --download-parmetis=1 --download-metis=1 --download-superlu=1 --download-scalapack=1 --download-superlu_dist=1 --useThreads=0 --download-superlu=1 --with-shared-libraries --with-cxx-dialect=C++11 --with-make-np=8
[5]PETSC ERROR: #1 User provided function() line 0 in unknown file
application called MPI_Abort(MPI_COMM_WORLD, 59) - process 5

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   PID 9424 RUNNING AT bd2c3b412458
=   EXIT CODE: 59
=   CLEANING UP REMAINING PROCESSES
=   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES

I know Badlands does not perform very well with MPI (serial is quite good), so I don't know whether that is the cause of this error or something else.

Thanks for your help again.

Cheers

julesghub commented 4 years ago

@brightson999 I don't know the exact reason for this error based on your output. I'd suspect 24 procs is over-decomposing the problem. Try fewer procs; 1-4 should be sufficient for this problem.
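
For example, a sketch based on your earlier command, just with fewer processes:

$ mpirun -np 4 python Tutorial_11_Coupling_with_Badlands.py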

brightson999 commented 4 years ago

You are right, fewer procs is fine. I will close this issue. Thank you again.