Open teuben opened 5 years ago
I agree with the general principle that information on the source of the data should be included, but would prefer to avoid writing separate provenance logic for every source class. The base class for sources is the SPHSource, which accepts arrays of particle properties as input, and therefore has no idea where the data is coming from.
I can see a little bit of room for improvement by recording which modules were provided to martini, and to the extent possible the arguments for each module, but copying the full particle arrays seems like overkill.
I could probably also add a function allowing the user to fill some information into the FITS header, if that would be useful.
With this in mind, I'd be happy to know if you have any concrete suggestions.
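To make the idea of recording module names and arguments (without copying particle arrays) a bit more concrete, here is a minimal sketch. The helper `describe_module` and the stand-in `GaussianNoise` class with its `rms`, `seed` and `samples` attributes are hypothetical and only for illustration; they are not MARTINI's actual API.

```python
def describe_module(obj):
    """Summarise an object as 'ClassName(arg=value, ...)'.

    Only small scalar-like attributes are kept; arrays, lists and other
    bulky values are skipped so the summary stays short.
    """
    parts = []
    for name, value in sorted(vars(obj).items()):
        if name.startswith("_"):
            continue  # ignore private attributes
        if isinstance(value, (bool, int, float, str)):
            parts.append(f"{name}={value!r}")
    return f"{type(obj).__name__}({', '.join(parts)})"


class GaussianNoise:
    """Hypothetical stand-in for a module passed to martini."""

    def __init__(self, rms, seed):
        self.rms = rms
        self.seed = seed
        self.samples = [0.0] * 1000  # bulky data, deliberately not recorded


summary = describe_module(GaussianNoise(rms=0.5, seed=42))
print(summary)
```

A string like this could then be stored per module, which keeps the provenance logic in one place instead of one implementation per source class.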
I will ponder this, yes.
I now have a proposal, which will be in a PR shortly. Essentially, each class will publish a self.history attribute, which is just a list of strings. In FITS you have to try to keep each entry within 72 (or so) characters, although there are ways to make it longer.
In my example the tail end of the FITS header looks as follows (I'm leaving out parameters we could pass along in the strings):
HISTORY MARTINI
HISTORY --prune_source
HISTORY --insert_source_in_cube
HISTORY --add_noise
HISTORY --convolve_beam
HISTORY SPHSource
HISTORY DataCube
HISTORY BaseBeam
HISTORY GaussianNoise
HISTORY WendlandC2Kernel
HISTORY GaussianSpectrum
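A rough sketch of how such per-class history lists might be collected into HISTORY lines, using only the Python standard library. The `WithHistory` mixin, the `history_cards` helper and the example file name are assumptions made for illustration; the class names are borrowed from the listing above but these are empty stand-ins, not the real MARTINI classes. Long entries are wrapped so that each line fits the 72 characters of text available after the 8-character keyword field.

```python
import textwrap

# A FITS card is 80 characters; the keyword field takes the first 8,
# leaving 72 characters of free text per HISTORY line.
FITS_HISTORY_WIDTH = 72


class WithHistory:
    """Hypothetical mixin: each object keeps a list of history strings."""

    def __init__(self):
        self.history = [type(self).__name__]


class SPHSource(WithHistory):
    pass


class DataCube(WithHistory):
    pass


def history_cards(objects, width=FITS_HISTORY_WIDTH):
    """Flatten the history lists of several objects into HISTORY lines."""
    cards = []
    for obj in objects:
        for entry in obj.history:
            # Wrap long entries over several lines rather than truncating
            for chunk in textwrap.wrap(entry, width=width):
                cards.append(f"HISTORY {chunk}")
    return cards


source = SPHSource()
# Hypothetical provenance entry supplied by the user
source.history.append("input file: " + "example_snapshot.hdf5 " * 6)
cards = history_cards([source, DataCube()])
for card in cards:
    print(card)
```

When writing the file with astropy, the same strings could presumably be passed to `Header.add_history` one at a time instead of being formatted by hand.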
I think that this still makes sense to do, and I plan to put something together at the same time as I look at #23 (probably borrowing from the proposed PR rather than using it directly, since it is very stale).
It would be nice to record the processing history (e.g. original filenames), so that one can read how the cube was made in the HISTORY or COMMENT section of the FITS file.