valeriupredoi / bgcval2

Package for BGCVal v2.0
3 stars 0 forks source link

Fix auto download permissions #114

Closed ledm closed 8 months ago

ledm commented 9 months ago

closes 111

This is for @DrYool.

Now bgcval2 checks whether you have write permission before attempting to overwrite the download script or change the download directory permissions.

There are two places on jasmin that this affects:

I've tested it with the job u-cq053 that @DrYool owns and it works fine for me.

So now, once this is merged and everyone uses it, this should not longer break if someone else has already downloaded the job that you want to study.

download_from_mass will now spit out a warning saying:

WARNING: someone else ( ayool ) owns this directory, so you may not be able to download files.
WARNING: Ask them politely to run "chmod g+w  /gws/nopw/j04/ukesm/BGC_data/u-cq053/ "

Things that are uncertain:

So just as a reminder:

To test this:

git fetch
git merge remotes/origin/fix_auto_download_permissions
analysis_compare -y input/yml
ledm commented 9 months ago

Happy to wait for a test. And maybe even @DrYool can give a thumbs up too?

valeriupredoi commented 9 months ago

sounds good, bud! Lemme hatch up a test then - I reckon it'll be a while for @DrYool to test since I believe he's not in next week, so we may as well merge after the test gets in :beer:

valeriupredoi commented 8 months ago

OK bud @ledm test in, ready to merge this from my side - you merge it whenever you reckon, before or after Andy ran a test, up to you :+1:

ledm commented 8 months ago

Any news from @DrYool on whether this works for him?

ledm commented 8 months ago

I guess it'll be hard until ol' Jassy is back online. Hopefully we'll know later this week?

valeriupredoi commented 8 months ago

theoretically/planned come back of Jasmine is le 7 du Novembre but let's see if that means actual functionality, data availability etc. We'll give a friendly ping to @DrYool then, I reckon - prob best to merge this now, methinks, so Andy can just grab the latest main? But no rush at all :beer:

ledm commented 8 months ago

If you say so!

DrYool commented 8 months ago

Hi Lee and V. OK, so I've done a git fetch, etc., and appear to have updated my code. However, I still get this sort of error when I try this ...

Traceback (most recent call last):
  File "/home/users/ayool/miniconda3/envs/bgcval2/bin/analysis_compare", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/users/ayool/bgcval2/bgcval2/analysis_compare.py", line 677, in main
    load_yml_and_run(compare_yml, config_user, skip_timeseries)
  File "/home/users/ayool/bgcval2/bgcval2/analysis_compare.py", line 589, in load_yml_and_run
    download_from_mass(jobID, doMoo=do_mass_download, auto_download=auto_download[jobID], config_user=config_user)
  File "/home/users/ayool/bgcval2/bgcval2/download_from_mass.py", line 418, in download_from_mass
    os.chmod(outputFold, st.st_mode | stat.S_IWGRP)
PermissionError: [Errno 13] Permission denied: '/gws/nopw/j04/ukesm/BGC_data/u-cy837/'

Does this make any sense to you? This seems a slightly different error from before.

ledm commented 8 months ago

I really don't understand - I never understood permissions in linux.

This path is:

drwxrwsr-x 1 ldemora   gws_ukesm    0 Oct 20 07:39 u-cy837

So you have permission to write to this path, (because I gave you permission already). The command that we're running adds group permissions.

Can you not change permission on someone else's file? How do I make it so that anyone in the group can change permissions?

valeriupredoi commented 8 months ago

Lee, I think you need to change that s with an x. But I think the problem here is that Andy doesn't need to do that - that chmod call should not happen if the dir has correct group permissions set by you, and you can do this once, on JASMIN

ledm commented 8 months ago

I want to avoid doing it manually every time we add a new job. It needs to be done in python.

A problem is that os.access() can't tell the difference between user write permissions and group write permissions (probably because it's a tool that is supposed to work with both linux and windows and mac systems.)

At the moment, the logic here is:

jobId_download_dir = '/gws/nopw/j04/ukesm/BGC_data/u-cy837/'

bool = Can I write there? 
if bool: grant others permissions
if not bool: print warning

But the logic should be:

jobId_download_dir = '/gws/nopw/j04/ukesm/BGC_data/u-cy837/'

bool = Can I write there? 
bool2 = Can others write there?

if bool and bool2: do nothing
if bool and not bool2: grant others write permissions
if not bool: print warning.

The problem is, I have no way of getting bool2 set up in python, as os.access() only cares about if I can write it, not others.

DrYool commented 8 months ago

In the short-term (i.e. Jane is breathing down my neck), does this mean that the solution is for me to ask data "owners" to open up their directories manually so that I can see / write into them? One would like to think there'd be a way in Linux of creating a space where it was a free-for-all and where permissions weren't necessary at all. However, I imagine that'd have unforeseen (by me, anyway) consequences around security, etc., that Linux would be wise to dodge.

ledm commented 8 months ago

We've already done that. This directory is drwxrwsr-x.

ledm commented 8 months ago

In the short term Andrew, just comment out line 419 of bgcval2/download_from_mass.py and replace it with pass.

DrYool commented 8 months ago

This is what the relevant block looks like in "my version" of this script ...

413     # Check permissions on the output folder 
414     i_can_write_this = os.access(outputFold, os.W_OK)
415     if i_can_write_this:
416         # I created this folder and I own it.
417         # make this folder group writeable.
418         os.chmod(outputFold, st.st_mode | stat.S_IWGRP)
419     else:
420         # Someone else created it and they own it,
421         # so I can't change permissions.
422         uname = getpwuid(st.st_uid).pw_name
423         print('WARNING: someone else (', uname, ') owns this directory, so you may not be able to dow    nload files.')
424         print('WARNING: Ask them politely to run "chmod g+w ',outputFold,'"')

The offending line is actually an else statement. Should I really be commenting that out?

It occurs that my installation is now out of date, but when I perform a git fetch, nothing comes back. This contrasts with yesterday, where I got a number of messages / warnings when I did this (as I expected). I presume that, so long as I follow the updating instructions given on the git, I should be fine. Right?

DrYool commented 8 months ago

Not being familiar with pass, I Googled it and found it's an alternative argument in if ... else syntax, so I thought I'd just have a go replacing line 419's else: with first pass: and then (when that didn't work) pass to see if that fixed things. Reader, it did not.

The change to pass: gave a syntax error - something I was half-expecting. While the change to pass got Python all pissy about indentation - something that I was also half-expecting.

My suspicion is that my line 419 is not the same as your line 419. I don't know why that is, given that I've updated my code, but I'll have done something wrong no doubt.

valeriupredoi commented 8 months ago

Andy, line 418, not 419, needs commented out and replaced with pass. As for the dir - this is the whole point of a GWS - its has to be open and free for all inside it, it has to be restricted outside it, and for those that don't have a membership on said GWS, it's a bit silly how JASMIN does things - a GWS is not quite fully open for its users, and not even for admin, e.g I am the admin of esmeval and I can't remove dirs created by GWS users (even though I should be able to, to police disk usage); in this case only the user or archadmin can :man_facepalming:

valeriupredoi commented 8 months ago

git operations to get you up to scratch with the latest main are:

git fetch -v
git checkout main (although you prob are already on main)
git pull origin main
valeriupredoi commented 8 months ago

@ledm

if bool and not bool2: grant others write permissions

on JASMIN, is restricted by the polices of the GWS, unless one applies a superlax -r 777 (which is not recommended) - I should have realized this when reviewing this PR, but, alas, should work elsewhere

ledm commented 8 months ago

Hmmm.... What about if we do this instead:

jobId_download_dir = '/gws/nopw/j04/ukesm/BGC_data/u-cy837/'

bool = Can I write there? 
bool2 = Do I own this?

if bool and bool2: grant others write permissions
if bool and not bool2: do nothing
if not bool: print warning.
valeriupredoi commented 8 months ago

Hmmm.... What about if we do this instead:

jobId_download_dir = '/gws/nopw/j04/ukesm/BGC_data/u-cy837/'

bool = Can I write there? 
bool2 = Do I own this?

if bool and bool2: grant others write permissions
if bool and not bool2: do nothing
if not bool: print warning.

think that should work!

DrYool commented 8 months ago

Bit of an update: still having trouble. I'm trying with two input_yml scripts, but both are failing with different errors. The first one is input_yml/proto_ukesm2_v2.yml and is looking at runs u-ct200 and u-ct739 only ...

Adding  ./images/TotalIceExtent_regionless_layerless_Together_DataOnly.png to script
Traceback (most recent call last):
  File "/home/users/ayool/miniconda3/envs/bgcval2/bin/analysis_compare", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/users/ayool/bgcval2/bgcval2/analysis_compare.py", line 677, in main
    load_yml_and_run(compare_yml, config_user, skip_timeseries)
  File "/home/users/ayool/bgcval2/bgcval2/analysis_compare.py", line 613, in load_yml_and_run
    timeseries_compare(
  File "/home/users/ayool/bgcval2/bgcval2/analysis_compare.py", line 440, in timeseries_compare
    comparehtml5Maker(
  File "/home/users/ayool/bgcval2/bgcval2/bgcval2_make_report.py", line 1639, in comparehtml5Maker
    assert 0
AssertionError

The second is input_yml/TerraFIRMA_en1_reversible.yml and is still failing with a permissions error ...

writing file in shared path /gws/nopw/j04/esmeval/bgcval2/shared_mass_scripts/u-cy837.sh
Traceback (most recent call last):
  File "/home/users/ayool/miniconda3/envs/bgcval2/bin/analysis_compare", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/users/ayool/bgcval2/bgcval2/analysis_compare.py", line 677, in main
    load_yml_and_run(compare_yml, config_user, skip_timeseries)
  File "/home/users/ayool/bgcval2/bgcval2/analysis_compare.py", line 589, in load_yml_and_run
    download_from_mass(jobID, doMoo=do_mass_download, auto_download=auto_download[jobID], config_user=config_user)
  File "/home/users/ayool/bgcval2/bgcval2/download_from_mass.py", line 508, in download_from_mass
    shutil.copy(download_script_path, shared_file_path)
  File "/home/users/ayool/miniconda3/envs/bgcval2/lib/python3.11/shutil.py", line 419, in copy
    copyfile(src, dst, follow_symlinks=follow_symlinks)
  File "/home/users/ayool/miniconda3/envs/bgcval2/lib/python3.11/shutil.py", line 258, in copyfile
    with open(dst, 'wb') as fdst:
PermissionError: [Errno 13] Permission denied

The error is strange, however, as the file it's pointing at (/gws/nopw/j04/esmeval/bgcval2/shared_mass_scripts/u-cy837.sh) seems to have the correct permissions, but also seems to be empty. So that's a bit weird.

ledm commented 8 months ago

Hi Andrew, the first one is failing for a strange reason, but you might be okay if you just comment out the assert 0 command on that line.

The second one is related to the issue we're trying to address with your previous problem.

ledm commented 8 months ago

u-ct200 is owned by Till! This is getting confusing!

ledm commented 8 months ago

@DrYool, can you do a:

git fetch
git merge origin/fix_auto_download_permissions

and try input_yml/proto_ukesm2_v2.yml again please?

DrYool commented 8 months ago

I did the git fetch, etc., but it reported my installation was up to date. And when I run input_yml/proto_ukesm2_v2.yml (without making the correction you mentioned above) I just get the same error. I'll have a look in a bit once my VC is over.

ledm commented 8 months ago

Sorry, I pushed it to the wrong branch. Please try again!

DrYool commented 8 months ago

Will do in a minute - I'm just waiting to update Jane on BGCVal2 in our VC ... 😬

ledm commented 8 months ago

I should also say that I already have a bgcval2 suite that's very similar to your second one. It's here: https://gws-access.jasmin.ac.uk/public/esmeval/CompareReports/bgcval2/ldemora/TerraFIRMA_overshoot_runs

It's a particularly expensive one to run with dozens of jobs, so might not be great to be running it twice! Also my colour scheme follows Robins's plots!

DrYool commented 8 months ago

Hi Andrew, the first one is failing for a strange reason, but you might be okay if you just comment out the assert 0 command on that line.

The second one is related to the issue we're trying to address with your previous problem.

Commented out the append 0 and got to the end!

-------------
Success
test with:
firefox CompareReports2/proto_ukesm2_v2/index.html
End of timeseries_compare
Finished... 

Admittedly, this is a job I'd previously truncated the experiment list of to get a result. I'll revisit it and see if the version I really wanted worked.

I'll update once I've set-up some new jobs, and we can maybe kill this issue if they proceed to plan. What do I need to do (if anything) about "validating" the fixes? And should anything be done about the append 0 issue? I can see from the surrounding code that this is a minor backwater already.

DrYool commented 8 months ago

I should also say that I already have a bgcval2 suite that's very similar to your second one. It's here: https://gws-access.jasmin.ac.uk/public/esmeval/CompareReports/bgcval2/ldemora/TerraFIRMA_overshoot_runs

It's a particularly expensive one to run with dozens of jobs, so might not be great to be running it twice! Also my colour scheme follows Robins's plots!

Yes, it's a biggie, and I probably don't need to run it again. But I'd like to do this ahead of tinkering with it. Also, the experience of the last couple of months is that I need to continually use BGCVal2 so that I don't forget what I'm doing with it! I got quite good early this year, but then forgot everything (it seems!). The lesson is, perhaps, don't get old. 🤣

valeriupredoi commented 8 months ago

good stuff @DrYool :partying_face: Yes, assert 0 should not find its way in modern Python code, right, @ledm - right? :grin: I don't know about validation of results, Lee can prob tell you more tho :+1: Gonna go review the PR and merge it then, good work, fellas!