njcuk9999 / apero-drs

A PipelinE to Reduce Observations - The DRS for SPIRou (CFHT)
MIT License

have an option to reduce the volume of processed files #664

Closed clairem789 closed 3 years ago

clairem789 commented 3 years ago

We may have it already - sorry if I missed it! I think we now need a few options to limit the number and volume of processed files. I don't know if all these options are possible, but they all make sense to me:

Option 1: create all files

Option 2: create and/or keep only CADC files

Option 3: create and/or keep only a few files (remove all DEBUG, remove npy, remove some duplicate spectra like _w and _v)

Thanks!

njcuk9999 commented 3 years ago

Sorry I didn't answer this sooner. This is not possible in 0.6, because some files are still required in the reduced directory (though you can manually remove files as suggested below, if you know they aren't used by any other process).

But in 0.7 the reduced directory can be completely or partially deleted as you wish (as we should only require a complete calibDB and telluDB - after that night's reduction - note that some files

The simplest way would just be to run an rm manually (full control + quickest)

>> cd reduced
>> rm -v */*_s1d_w*.fits
>> rm -v */*.npy
>> rm -v */DEBUG*.fits

Note that some npy files are used to speed up calculations that have already been done (and haven't changed with the current reduction) - you can always delete these but some processes may be slower as these calculations will have to be re-done.
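
Before deleting anything, it can help to see how much space each group of files actually takes. The following is a minimal Python sketch, assuming the reduced directory holds one sub-directory per night (as the globs in the rm commands imply). `volume_by_pattern` is a hypothetical helper written for illustration; it is not part of APERO.

```python
from pathlib import Path


def volume_by_pattern(reduced_dir, patterns):
    """Return {pattern: (n_files, total_bytes)} for globs relative to reduced_dir."""
    root = Path(reduced_dir)
    report = {}
    for pattern in patterns:
        # Only count regular files; directories matching a glob are ignored
        files = [p for p in root.glob(pattern) if p.is_file()]
        report[pattern] = (len(files), sum(p.stat().st_size for p in files))
    return report


if __name__ == "__main__":
    # Globs modelled on the rm commands (assumed layout: reduced/<night>/<file>)
    patterns = ["*/*_s1d_w*.fits", "*/*.npy", "*/DEBUG*.fits"]
    for pattern, (count, size) in volume_by_pattern("reduced", patterns).items():
        print(f"{pattern}: {count} files, {size / 1e9:.2f} GB")
```

This only reads file sizes, so it is safe to run on a live reduction.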

Note there already exists an apero_reset.py tool that will delete everything in the reduced directory if requested - but currently it doesn't distinguish between file types, and removing files in Python is much slower than using rm.

I do have a cleaning option in the "obj_postprocess_spirou.py" code (the code that builds the CADC outputs) - but it will only remove files that have been pushed into CADC output form (so not the debug files) and it is done on an object by object basis.

All that being said I could write a tool that basically does all the rm for you - I guess you'd have to be asked which files you don't want or supply a list of suffixes etc (but it will be slower than using rm)
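
As a rough idea of what such a tool could look like, here is a minimal Python sketch that takes a list of glob patterns and removes the matching files, with a dry-run mode as the default. The function name `clean_reduced`, its arguments and the one-directory-per-night layout are all assumptions made for illustration; this is not an existing APERO tool.

```python
from pathlib import Path


def clean_reduced(reduced_dir, patterns, dry_run=True):
    """Delete files under reduced_dir matching any glob in patterns.

    With dry_run=True (the default) nothing is deleted; the matching
    paths are only returned so they can be checked first.
    """
    root = Path(reduced_dir)
    matched = set()
    for pattern in patterns:
        # Collect regular files only; a set avoids double-counting files
        # that match more than one pattern
        matched.update(p for p in root.glob(pattern) if p.is_file())
    matched = sorted(matched)
    if not dry_run:
        for path in matched:
            path.unlink()
    return matched


if __name__ == "__main__":
    # Hypothetical usage: list what would go, then delete for real
    to_remove = clean_reduced("reduced", ["*/DEBUG*.fits", "*/*.npy"])
    print(f"{len(to_remove)} files would be removed")
    # clean_reduced("reduced", ["*/DEBUG*.fits", "*/*.npy"], dry_run=False)
```

Defaulting to a dry run is a deliberate choice here: since some files may still be needed by other processes, it seems safer to inspect the list before anything is unlinked.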

njcuk9999 commented 3 years ago

Okay having looked more into this there will be options to turn on / off debug outputs (the DEBUG_background files were taking up a lot of space)

In 0.7 the following constants will be added to the user_constants.ini

njcuk9999 commented 3 years ago

So you'll have this option in 0.7 - just need to edit the user_config.ini and in theory once the CADC outputs are made you could delete the whole reduced directory at a push!