radical-cybertools / radical.pilot

RADICAL-Pilot
http://radical-cybertools.github.io/radical-pilot/index.html

Update Amarel Resource Config #2942

Closed AymenFJA closed 1 year ago

AymenFJA commented 1 year ago

updating Amarel resource config

codecov[bot] commented 1 year ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 41.45%. Comparing base (feddbf9) to head (2f9446c). Report is 2631 commits behind head on devel.

Additional details and impacted files

```diff
@@           Coverage Diff           @@
##            devel    #2942   +/-   ##
=======================================
  Coverage   41.45%   41.45%
=======================================
  Files          95       95
  Lines       10481    10481
=======================================
  Hits         4345     4345
  Misses       6136     6136
```


AymenFJA commented 1 year ago

> LGTM! can you please make a test run for RP and feel free to merge! Thank you!

@mtitov the test run is failing (rp.session.hal0271.amarel.rutgers.edu.afa64.019499.0003.log) with:

1684787757.212 : rp.session.hal0271.amarel.rutgers.edu.afa64.019499.0003 : 38020 : 140123217000256 : DEBUG    : tar result:
---
tar (child): bzip2: Cannot exec: No such file or directory
tar (child): Error is not recoverable: exiting now
tar: Child returned status 2
tar: Error is not recoverable: exiting now

---
2
1684787757.444 : rp.session.hal0271.amarel.rutgers.edu.afa64.019499.0003 : 38020 : 140123217000256 : ERROR    : failed to fetch pilot.0000 for profiles
Traceback (most recent call last):
  File "/cache/home/afa64/ve/rct/lib/python3.9/site-packages/radical/pilot/utils/session.py", line 215, in fetch_filetype
    raise RuntimeError('could not create tarball: %s' %
RuntimeError: could not create tarball: tar (child): bzip2: Cannot exec: No such file or directory
tar (child): Error is not recoverable: exiting now
tar: Child returned status 2
tar: Error is not recoverable: exiting now

The same error appears in radical.saga.pty.log:

1684787756.515 : radical.saga.pty     : 38020 : 140123217000256 : DEBUG    : read : [    4] [  501] (\nSTATE : FAILED\n\nECODE=2\nSTART  : 1684787756\nSTOP   : 1684787756\n\nSTART_STDERR\ntar (child): bzip2: Cannot exec: No such file or directory\ntar (child): Error is not recoverable: exiting now\ntar: Child returned status 2\ntar: Error is not recoverable: exiting now\nEND_STDERR\n\nSTART_STDOUT\ntar (child): bzip2: Cannot exec: No such file or directory\ntar (child): Error is not recoverable: exiting now\ntar: Child returned status 2\ntar: Error is not recoverable: exiting now\nEND_STDOUT\n\n)
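
For anyone reproducing this: the message `bzip2: Cannot exec` means that `tar` tried to start an external `bzip2` binary and did not find one on the host where the tarball is created. Below is a minimal sketch, not RP's actual code, of how one might check for that binary and, as an alternative, build a bzip2 tarball with Python's own `tarfile` module, which compresses in-process. The function names and the sample file name `bootstrap_0.prof` are hypothetical, and the sketch assumes the Python interpreter was built with `bz2` support.

```python
import shutil
import tarfile
import tempfile
from pathlib import Path


def external_bzip2_available(which=shutil.which) -> bool:
    """Return True if a `bzip2` executable is found on PATH.

    `which` is injectable so the check can be exercised without
    depending on the local machine (hypothetical helper).
    """
    return which('bzip2') is not None


def make_tarball(src_dir: str, tgt: str) -> str:
    """Pack `src_dir` into a bzip2-compressed tarball at `tgt`.

    Uses Python's `tarfile` module, so no external `bzip2` process
    is started (hypothetical helper, not part of RP).
    """
    with tarfile.open(tgt, 'w:bz2') as tar:
        tar.add(src_dir, arcname=Path(src_dir).name)
    return tgt


if __name__ == '__main__':
    print('bzip2 on PATH:', external_bzip2_available())
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / 'pilot.0000'
        src.mkdir()
        # dummy file standing in for a profile; the name is an assumption
        (src / 'bootstrap_0.prof').write_text('dummy profile line\n')
        out = make_tarball(str(src), str(Path(tmp) / 'pilot.0000.tbz'))
        print('created', Path(out).name)
```

The check has to run on the machine that executes `tar`; in this session that may be the remote side reached through the SAGA shell rather than the local login node.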

radical-stack:

(rct) [afa64@hal0271 rp.session.hal0271.amarel.rutgers.edu.afa64.019499.0003]$ radical-stack

  python               : /cache/home/afa64/ve/rct/bin/python3
  pythonpath           :
  version              : 3.9.5
  virtualenv           : /cache/home/afa64/ve/rct

  radical.entk         : 1.30.0
  radical.gtod         : 1.20.1
  radical.pilot        : 1.34.0-v1.33.0-625-gbf3a4e2@update-amarel_config
  radical.saga         : 1.33.0
  radical.utils        : 1.33.0

mtitov commented 1 year ago

@AymenFJA can you please run it again without removing the original line, only adding the new one:

"module load py-data-science-stack/5.1.0-kp807",
"module load python/3.9.6-gc563",

AymenFJA commented 1 year ago

> @AymenFJA can you please run it again without removing the original line, only adding the new one:
>
> "module load py-data-science-stack/5.1.0-kp807",
> "module load python/3.9.6-gc563",

@mtitov same issue with your suggestions.

mtitov commented 1 year ago

Just to keep track of the next actions: we need to confirm that this is not an issue with the default settings of the Session.close() method (i.e., without setting download=True).

AymenFJA commented 1 year ago

@andre-merzky @mtitov env.txt