RickKessler / SNANA

Supernova Analysis package

Diagnosing issue causing permission errors #1445

Open mattgrayling opened 1 week ago

mattgrayling commented 1 week ago

I've been running into some very unusual permission errors when simulating with SNANA on Perlmutter, specifically during the merge process. We've been speaking with NERSC about what the cause might be, but we haven't been able to diagnose what is going on so far. We've already added the output of os.stat to the logs, and they've now requested that we amend the code with the following suggestions to get some more information:

  1. Can you check that the file is indeed closed before running the operation that triggers the warning? I expect this would look like:

if file.closed:
    print("file is closed")
else:
    print("file is open")

  2. Instead of os.stat(merge_file) could you do os.access(merge_file)?

  3. Use a shell command to print out the file. For example something like:

import subprocess

subprocess.run(["cat", merge_file])

Could you add these to the relevant file? I think it's read_merge_file in submit_util.py, but I'm not 100% sure.
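For reference, the three suggestions could be combined into one helper along these lines. This is only a rough sketch: diagnose_file is a hypothetical name, not an existing SNANA function, and it assumes cat is available on the node. Note that os.access needs a mode flag (such as os.R_OK or os.W_OK) as a second argument, so the sketch checks read and write access separately.

```python
import os
import subprocess


def diagnose_file(path):
    """Collect permission diagnostics for `path`.

    Hypothetical helper for illustration only; not part of SNANA.
    """
    info = {"path": path, "exists": os.path.exists(path)}

    # os.access requires an explicit mode flag, so test each one separately.
    info["readable"] = os.access(path, os.R_OK)
    info["writable"] = os.access(path, os.W_OK)

    # Keep the os.stat output that is already being logged.
    try:
        st = os.stat(path)
        info["mode"] = oct(st.st_mode & 0o777)
        info["uid"] = st.st_uid
        info["gid"] = st.st_gid
    except OSError as exc:
        info["stat_error"] = repr(exc)

    # Read the file through a separate process, as suggested with `cat`.
    result = subprocess.run(["cat", path], capture_output=True, text=True)
    info["cat_returncode"] = result.returncode
    info["cat_stderr"] = result.stderr.strip()
    return info
```

Calling print(diagnose_file(merge_file)) just before the step that raises the permission error would put all of this in the log in one place.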

RickKessler commented 1 week ago

I added more diagnostics:

[Image: Screenshot 2024-11-18 at 3 14 35 PM]

Note that f.closed won't work because there is no file pointer.