Closed ledm closed 8 months ago
Happy to wait for a test. And maybe even @DrYool can give a thumbs up too?
sounds good, bud! Lemme hatch up a test then - I reckon it'll be a while for @DrYool to test since I believe he's not in next week, so we may as well merge after the test gets in :beer:
OK bud @ledm test in, ready to merge this from my side - you merge it whenever you reckon, before or after Andy ran a test, up to you :+1:
Any news from @DrYool on whether this works for him?
I guess it'll be hard until ol' Jassy is back online. Hopefully we'll know later this week?
theoretically/planned come back of Jasmine is le 7 du Novembre but let's see if that means actual functionality, data availability etc. We'll give a friendly ping to @DrYool then, I reckon - prob best to merge this now, methinks, so Andy can just grab the latest main
? But no rush at all :beer:
If you say so!
Hi Lee and V. OK, so I've done a git fetch
, etc., and appear to have updated my code. However, I still get this sort of error when I try this ...
Traceback (most recent call last):
File "/home/users/ayool/miniconda3/envs/bgcval2/bin/analysis_compare", line 8, in <module>
sys.exit(main())
^^^^^^
File "/home/users/ayool/bgcval2/bgcval2/analysis_compare.py", line 677, in main
load_yml_and_run(compare_yml, config_user, skip_timeseries)
File "/home/users/ayool/bgcval2/bgcval2/analysis_compare.py", line 589, in load_yml_and_run
download_from_mass(jobID, doMoo=do_mass_download, auto_download=auto_download[jobID], config_user=config_user)
File "/home/users/ayool/bgcval2/bgcval2/download_from_mass.py", line 418, in download_from_mass
os.chmod(outputFold, st.st_mode | stat.S_IWGRP)
PermissionError: [Errno 13] Permission denied: '/gws/nopw/j04/ukesm/BGC_data/u-cy837/'
Does this make any sense to you? This seems a slightly different error from before.
I really don't understand - I never understood permissions in linux.
This path is:
drwxrwsr-x 1 ldemora gws_ukesm 0 Oct 20 07:39 u-cy837
So you have permission to write to this path, (because I gave you permission already). The command that we're running adds group permissions.
Can you not change permission on someone else's file? How do I make it so that anyone in the group can change permissions?
Lee, I think you need to change that s
with an x
. But I think the problem here is that Andy doesn't need to do that - that chmod call should not happen if the dir has correct group permissions set by you, and you can do this once, on JASMIN
I want to avoid doing it manually every time we add a new job. It needs to be done in python.
A problem is that os.access()
can't tell the difference between user write permissions and group write permissions (probably because it's a tool that is supposed to work with both linux and windows and mac systems.)
At the moment, the logic here is:
jobId_download_dir = '/gws/nopw/j04/ukesm/BGC_data/u-cy837/'
bool = Can I write there?
if bool: grant others permissions
if not bool: print warning
But the logic should be:
jobId_download_dir = '/gws/nopw/j04/ukesm/BGC_data/u-cy837/'
bool = Can I write there?
bool2 = Can others write there?
if bool and bool2: do nothing
if bool and not bool2: grant others write permissions
if not bool: print warning.
The problem is, I have no way of getting bool2 set up in python, as os.access() only cares about if I can write it, not others.
In the short-term (i.e. Jane is breathing down my neck), does this mean that the solution is for me to ask data "owners" to open up their directories manually so that I can see / write into them? One would like to think there'd be a way in Linux of creating a space where it was a free-for-all and where permissions weren't necessary at all. However, I imagine that'd have unforeseen (by me, anyway) consequences around security, etc., that Linux would be wise to dodge.
We've already done that. This directory is drwxrwsr-x
.
In the short term Andrew, just comment out line 419 of bgcval2/download_from_mass.py
and replace it with pass
.
This is what the relevant block looks like in "my version" of this script ...
413 # Check permissions on the output folder
414 i_can_write_this = os.access(outputFold, os.W_OK)
415 if i_can_write_this:
416 # I created this folder and I own it.
417 # make this folder group writeable.
418 os.chmod(outputFold, st.st_mode | stat.S_IWGRP)
419 else:
420 # Someone else created it and they own it,
421 # so I can't change permissions.
422 uname = getpwuid(st.st_uid).pw_name
423 print('WARNING: someone else (', uname, ') owns this directory, so you may not be able to dow nload files.')
424 print('WARNING: Ask them politely to run "chmod g+w ',outputFold,'"')
The offending line is actually an else
statement. Should I really be commenting that out?
It occurs that my installation is now out of date, but when I perform a git fetch
, nothing comes back. This contrasts with yesterday, where I got a number of messages / warnings when I did this (as I expected). I presume that, so long as I follow the updating instructions given on the git, I should be fine. Right?
Not being familiar with pass
, I Googled it and found it's an alternative argument in if ... else
syntax, so I thought I'd just have a go replacing line 419's else:
with first pass:
and then (when that didn't work) pass
to see if that fixed things. Reader, it did not.
The change to pass:
gave a syntax error - something I was half-expecting. While the change to pass
got Python all pissy about indentation - something that I was also half-expecting.
My suspicion is that my line 419 is not the same as your line 419. I don't know why that is, given that I've updated my code, but I'll have done something wrong no doubt.
Andy, line 418, not 419, needs commented out and replaced with pass
. As for the dir - this is the whole point of a GWS - its has to be open and free for all inside it, it has to be restricted outside it, and for those that don't have a membership on said GWS, it's a bit silly how JASMIN does things - a GWS is not quite fully open for its users, and not even for admin, e.g I am the admin of esmeval
and I can't remove dirs created by GWS users (even though I should be able to, to police disk usage); in this case only the user or archadmin can :man_facepalming:
git operations to get you up to scratch with the latest main
are:
git fetch -v
git checkout main (although you prob are already on main)
git pull origin main
@ledm
if bool and not bool2: grant others write permissions
on JASMIN, is restricted by the polices of the GWS, unless one applies a superlax -r 777
(which is not recommended) - I should have realized this when reviewing this PR, but, alas, should work elsewhere
Hmmm.... What about if we do this instead:
jobId_download_dir = '/gws/nopw/j04/ukesm/BGC_data/u-cy837/'
bool = Can I write there?
bool2 = Do I own this?
if bool and bool2: grant others write permissions
if bool and not bool2: do nothing
if not bool: print warning.
Hmmm.... What about if we do this instead:
jobId_download_dir = '/gws/nopw/j04/ukesm/BGC_data/u-cy837/' bool = Can I write there? bool2 = Do I own this? if bool and bool2: grant others write permissions if bool and not bool2: do nothing if not bool: print warning.
think that should work!
Bit of an update: still having trouble. I'm trying with two input_yml
scripts, but both are failing with different errors. The first one is input_yml/proto_ukesm2_v2.yml
and is looking at runs u-ct200 and u-ct739 only ...
Adding ./images/TotalIceExtent_regionless_layerless_Together_DataOnly.png to script
Traceback (most recent call last):
File "/home/users/ayool/miniconda3/envs/bgcval2/bin/analysis_compare", line 8, in <module>
sys.exit(main())
^^^^^^
File "/home/users/ayool/bgcval2/bgcval2/analysis_compare.py", line 677, in main
load_yml_and_run(compare_yml, config_user, skip_timeseries)
File "/home/users/ayool/bgcval2/bgcval2/analysis_compare.py", line 613, in load_yml_and_run
timeseries_compare(
File "/home/users/ayool/bgcval2/bgcval2/analysis_compare.py", line 440, in timeseries_compare
comparehtml5Maker(
File "/home/users/ayool/bgcval2/bgcval2/bgcval2_make_report.py", line 1639, in comparehtml5Maker
assert 0
AssertionError
The second is input_yml/TerraFIRMA_en1_reversible.yml
and is still failing with a permissions error ...
writing file in shared path /gws/nopw/j04/esmeval/bgcval2/shared_mass_scripts/u-cy837.sh
Traceback (most recent call last):
File "/home/users/ayool/miniconda3/envs/bgcval2/bin/analysis_compare", line 8, in <module>
sys.exit(main())
^^^^^^
File "/home/users/ayool/bgcval2/bgcval2/analysis_compare.py", line 677, in main
load_yml_and_run(compare_yml, config_user, skip_timeseries)
File "/home/users/ayool/bgcval2/bgcval2/analysis_compare.py", line 589, in load_yml_and_run
download_from_mass(jobID, doMoo=do_mass_download, auto_download=auto_download[jobID], config_user=config_user)
File "/home/users/ayool/bgcval2/bgcval2/download_from_mass.py", line 508, in download_from_mass
shutil.copy(download_script_path, shared_file_path)
File "/home/users/ayool/miniconda3/envs/bgcval2/lib/python3.11/shutil.py", line 419, in copy
copyfile(src, dst, follow_symlinks=follow_symlinks)
File "/home/users/ayool/miniconda3/envs/bgcval2/lib/python3.11/shutil.py", line 258, in copyfile
with open(dst, 'wb') as fdst:
PermissionError: [Errno 13] Permission denied
The error is strange, however, as the file it's pointing at (/gws/nopw/j04/esmeval/bgcval2/shared_mass_scripts/u-cy837.sh
) seems to have the correct permissions, but also seems to be empty. So that's a bit weird.
Hi Andrew, the first one is failing for a strange reason, but you might be okay if you just comment out the assert 0
command on that line.
The second one is related to the issue we're trying to address with your previous problem.
u-ct200
is owned by Till! This is getting confusing!
@DrYool, can you do a:
git fetch
git merge origin/fix_auto_download_permissions
and try input_yml/proto_ukesm2_v2.yml
again please?
I did the git fetch
, etc., but it reported my installation was up to date. And when I run input_yml/proto_ukesm2_v2.yml
(without making the correction you mentioned above) I just get the same error. I'll have a look in a bit once my VC is over.
Sorry, I pushed it to the wrong branch. Please try again!
Will do in a minute - I'm just waiting to update Jane on BGCVal2 in our VC ... 😬
I should also say that I already have a bgcval2 suite that's very similar to your second one. It's here: https://gws-access.jasmin.ac.uk/public/esmeval/CompareReports/bgcval2/ldemora/TerraFIRMA_overshoot_runs
It's a particularly expensive one to run with dozens of jobs, so might not be great to be running it twice! Also my colour scheme follows Robins's plots!
Hi Andrew, the first one is failing for a strange reason, but you might be okay if you just comment out the
assert 0
command on that line.The second one is related to the issue we're trying to address with your previous problem.
Commented out the append 0
and got to the end!
-------------
Success
test with:
firefox CompareReports2/proto_ukesm2_v2/index.html
End of timeseries_compare
Finished...
Admittedly, this is a job I'd previously truncated the experiment list of to get a result. I'll revisit it and see if the version I really wanted worked.
I'll update once I've set-up some new jobs, and we can maybe kill this issue if they proceed to plan. What do I need to do (if anything) about "validating" the fixes? And should anything be done about the append 0
issue? I can see from the surrounding code that this is a minor backwater already.
I should also say that I already have a bgcval2 suite that's very similar to your second one. It's here: https://gws-access.jasmin.ac.uk/public/esmeval/CompareReports/bgcval2/ldemora/TerraFIRMA_overshoot_runs
It's a particularly expensive one to run with dozens of jobs, so might not be great to be running it twice! Also my colour scheme follows Robins's plots!
Yes, it's a biggie, and I probably don't need to run it again. But I'd like to do this ahead of tinkering with it. Also, the experience of the last couple of months is that I need to continually use BGCVal2 so that I don't forget what I'm doing with it! I got quite good early this year, but then forgot everything (it seems!). The lesson is, perhaps, don't get old. 🤣
good stuff @DrYool :partying_face: Yes, assert 0
should not find its way in modern Python code, right, @ledm - right? :grin: I don't know about validation of results, Lee can prob tell you more tho :+1: Gonna go review the PR and merge it then, good work, fellas!
closes 111
This is for @DrYool.
Now bgcval2 checks whether you have write permission before attempting to overwrite the download script or change the download directory permissions.
There are two places on jasmin that this affects:
/gws/nopw/j04/ukesm/BGC_data/
)/gws/nopw/j04/esmeval/bgcval2/shared_mass_scripts/
)I've tested it with the job
u-cq053
that @DrYool owns and it works fine for me.So now, once this is merged and everyone uses it, this should not longer break if someone else has already downloaded the job that you want to study.
download_from_mass
will now spit out a warning saying:Things that are uncertain:
So just as a reminder:
download_from_mass -j u-aa111 jobID2 jobID3
To test this: