Open labsaha opened 4 years ago
Hi @labsaha could you copy here to which error you are referring?
File "lstchain_data_r0_to_dl1.py", line 73, in main
pointing_file_path=args.pointing_file_path
File "/fefs/aswg/software/virtual_env/ctasoft/cta-lstchain/lstchain/reco/dl0_to_dl1.py", line 181, in r0_to_dl1
write_metadata(metadata, output_filename)
File "/fefs/aswg/software/virtual_env/ctasoft/cta-lstchain/lstchain/io/io.py", line 454, in write_metadata
attribute = file.root._v_attrs[k].extend(metadata[k])
AttributeError: 'NoneType' object has no attribute 'extend'
File "lstchain_data_r0_to_dl1.py", line 73, in main pointing_file_path=args.pointing_file_path File "/fefs/aswg/software/virtual_env/ctasoft/cta-lstchain/lstchain/reco/dl0_to_dl1.py", line 181, in r0_to_dl1 write_metadata(metadata, output_filename) File "/fefs/aswg/software/virtual_env/ctasoft/cta-lstchain/lstchain/io/io.py", line 454, in write_metadata attribute = file.root._v_attrs[k].extend(metadata[k]) AttributeError: 'NoneType' object has no attribute 'extend'
So you can see that it was not complaining about the presence of the DL1 file. When we deleted the DL1 files, we didn't get this error.
Hi @labsaha It seems from @andres-baquero comment that the file was corrupted but I don't understand the context.
Could you provide more information, please? What command did you run?
I do not think it is corrupted, this happens when you try to overwrite a file, no matter whether the existing one is corrupted or not, right @labsaha, @andres-baquero ?
Ok, Let me explain a bit in detail. While producing the DL1 files using the standard command associated with the script dl0_to_dl1.py, the code failed due to the pointing issue we are discussing. However, it produced a DL1 file which is not the full DL1 file. When we fixed the pointing code and run the script it shows this error. This should clearly say that if we want to overwrite or by default it should overwrite without complaining. I hope I made it clear. If not, let me know.
We ran like this:
python lstchain_data_r0_to_dl1.py -f /fefs/aswg/data/real/R0/20200118/LST-1.1.Run01832.0003.fits.fz -o /fefs/aswg/data-test/real/dl1/20200118 -pedestal /fefs/aswg/data/real/calibration/20200118/v01/drs4_pedestal.Run1830.0000.fits -calib /fefs/aswg/data/real/calibration/20200118/v01/calibration.Run1831.0000.hdf5 -time_calib /fefs/aswg/data/real/calibration/20200118/v01/time_calibration.Run1831.0000.hdf5 -pointing /fefs/home/lapp/DrivePositioning/drive_log_20_01_20.txt
Some more information: If I run the script again without deleting the DL1 file which is not corrupted then we get this error message:
File "lstchain_data_r0_to_dl1.py", line 78, in <module>
main()
File "lstchain_data_r0_to_dl1.py", line 73, in main
pointing_file_path=args.pointing_file_path
File "/fefs/aswg/software/virtual_env/ctasoft/cta-lstchain/lstchain/reco/dl0_to_dl1.py", line 215, in r0_to_dl1
write_array_info(event, output_filename)
File "/fefs/aswg/software/virtual_env/ctasoft/cta-lstchain/lstchain/io/io.py", line 350, in write_array_info
append=True
File "/fefs/aswg/software/virtual_env/anaconda3/envs/cta/lib/python3.7/site-packages/astropy/table/connect.py", line 114, in __call__
registry.write(instance, *args, **kwargs)
File "/fefs/aswg/software/virtual_env/anaconda3/envs/cta/lib/python3.7/site-packages/astropy/io/registry.py", line 566, in write
writer(data, *args, **kwargs)
File "/fefs/aswg/software/virtual_env/anaconda3/envs/cta/lib/python3.7/site-packages/astropy/io/misc/hdf5.py", line 308, in write_table_hdf5
compatibility_mode=compatibility_mode)
File "/fefs/aswg/software/virtual_env/anaconda3/envs/cta/lib/python3.7/site-packages/astropy/io/misc/hdf5.py", line 323, in write_table_hdf5
raise OSError("Table {0} already exists".format(path))
OSError: Table /instrument/subarray/layout already exists
I understand now, thanks. Well, partial production or writing of a file is not supported at the moment.
In my scripts, I add an "overwrite" option, so that the user can choose to either overwrite, or fail in the case a file is there already:
output_path = Path(expandvars(self.output_filename)).expanduser()
if output_path.exists() and self.overwrite:
self.log.warning(f"Overwriting {output_path}")
output_path.unlink()
Apart from this, I am also thinking about assigning a code number associated with errors. It will allow us to know quickly what is the problem. For example, if a run finishes without an error, it should return 0
. If it finishes with an error it returns with something like xx
or so on. It will help us to debug quickly when the data reduction pipeline runs automatically.
@kosack @vuillaut @rlopezcoto What do you think on this? Or the same implementation can be taken care differently. Any idea or solution?
I'd suggest (anticipating the upcoming CTA Programming Standards Doc) that you use the same return codes as in the Linux errorno.h
header file (0=success, positive are defined errors), and then use negative values for any custom statuses. That way, you are at least somewhat consistent. However, you should still catch exceptions and issue a nice error message.
Note that if you use the ctapipe.core.Tool
, you get some standard behavior when a tool finishes normally, is interrupted (by ctrl-C for example), or catches an exception somewhere (though right now it doesn't use return codes, but it should!)
Thank you @kosack. I think what you mention here will be covered by ctapipe DL1 writer, right?
Oh, I should say also that the Provenence information in ctapipe tells you something about the status of output files as well. E.g. if I run a Tool and interrupt it, I get a provenance entry like:
...
"status": "interrupted",
"duration_min": 0.09620000000007067
}
That's from the DL1 writer (and anything inheriting from Tool)
I think what you mention here will be covered by ctapipe DL1 writer, right?
Yes
I think what you mention here will be covered by ctapipe DL1 writer, right?
Yes
Then I would personally suggest making a quick fix concerning the current issue but not throwing a lot of effort in something that will be replaced in the very near future.
@kosack @vuillaut Thanks!. Yes, it will be very good if we go for a quick fix on this issue.
is anybody working on this fix within lstchain? @labsaha, @vuillaut? One quick solution could be that proposed by @yrenier, in which you check whether the file exists and throw an error if it does:
if os.path.exists(output_filename):
raise AttributeError(output_filename + ' exists, exiting.')
I am not working on this and cannot work on this in the future. Yes, as a quick solution what @yrenier suggested can be considered.
Yesterday while trying to run
dl0-dl1
scripts, we found a strange error. Finally, we could manage to understand that it was due to the presence of the partially generated DL1 file. When we deleted the file, we could run the script without any problem. In such cases, the error message should be very transparent which is not the case for the present version of the code. This is also required for ourLSTOSA
(LST-On-Site Analysis) development. Depending upon the error, we can decide to run the code again with associated steps to correct it automatically or raise some errors which require to be handled manually.