Closed by balerion 3 years ago
This has been fixed in this commit in the hackathon branch.
Metadata is now correctly retained when appending parquets. The JSON metadata file for the appended parquets contains a dictionary that aggregates the runInfo of all included runs, so that the number of electrons and macrobunches, the run numbers, etc. are represented correctly. The metadata of each run is stored alongside this aggregate as a dictionary with the same structure as before.
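For anyone reading along, here is a rough sketch of the idea, not the code from the commit. The function name `merge_run_info`, the top-level keys `aggregate` and `runs`, and the field names `numberOfElectrons` and `numberOfMacrobunches` are all assumptions made for illustration; the real runInfo layout may differ.

```python
import json
from pathlib import Path


def merge_run_info(meta, run_number, run_info):
    """Return new metadata with one run's runInfo merged in (hypothetical layout).

    `meta` is the previously stored metadata dict, or None on the first write.
    """
    runs = dict(meta["runs"]) if meta else {}
    # Keep each run's own dictionary, unchanged, keyed by run number.
    runs[str(run_number)] = run_info
    # Rebuild the aggregate from the per-run entries, so that appending
    # the same run twice does not double count it.
    aggregate = {
        "runNumbers": sorted(int(k) for k in runs),
        "numberOfElectrons": sum(r["numberOfElectrons"] for r in runs.values()),
        "numberOfMacrobunches": sum(r["numberOfMacrobunches"] for r in runs.values()),
    }
    return {"aggregate": aggregate, "runs": runs}


def append_metadata_file(path, run_number, run_info):
    """Read the JSON metadata file if it exists, merge, and write it back."""
    path = Path(path)
    old = json.loads(path.read_text()) if path.exists() else None
    new = merge_run_info(old, run_number, run_info)
    path.write_text(json.dumps(new, indent=2))
    return new
```

The point of the sketch is that an append reads the existing metadata and merges into it, rather than writing only the runInfo of the latest interval.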
Should we close this issue if this has been fixed?
yes, should be fixed!
If any problems with metadata handling remain, we should open a new issue.
The metadata gets overwritten by the last interval when storing dataframes in append mode. In other words, the data gets appended, but the metadata is only aware of the last append.