When loading LiPD files, pyLipd (as did the LiPD utilities before it) currently saddles the lipd object with a bunch of utterly useless entries. For instance, when running this snippet:
We see that approximately half of the entries are "year", which means that the corresponding series is basically "x = year, y = year". This only serves to confuse users and burden the RAM. There needs to be a way to banish these things.
Either:
implement a rule in get_timeseries_essentials() so that if "year" or "age" is in lower(paleoData_variableName), we chuck the series. If you are worried about chucking potentially valuable series, this could at the very least be done through a boolean flag set to False by default.
or implement a function called paleoData_cleanup() that filters either the lipd object or the dataframe obtained from get_timeseries_essentials().
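To make the second option concrete, here is a minimal sketch of what such a filter could look like. Nothing in it exists in pyLipd today: paleoData_cleanup() is the function proposed above, is_time_axis_name() is a hypothetical helper, and the rows are made-up records standing in for the output of get_timeseries_essentials(). It also assumes an exact match on the lowercased name rather than a substring test, since a substring test would drop names like "average" (which contains "age").

```python
# Hypothetical sketch: none of these helpers exist in pyLipd.
TIME_AXIS_NAMES = frozenset({"year", "age"})


def is_time_axis_name(variable_name, names=TIME_AXIS_NAMES):
    """Return True if the name is exactly one of `names`, ignoring case and padding."""
    if not isinstance(variable_name, str):
        return False
    return variable_name.strip().lower() in names


def paleoData_cleanup(rows, names=TIME_AXIS_NAMES):
    """Return only the rows whose paleoData_variableName is not a time axis.

    `rows` is assumed to be an iterable of dicts, one per series.
    """
    return [
        row for row in rows
        if not is_time_axis_name(row.get("paleoData_variableName"), names)
    ]


# Made-up records, for illustration only.
rows = [
    {"dataSetName": "demo", "paleoData_variableName": "year"},
    {"dataSetName": "demo", "paleoData_variableName": "d18O"},
    {"dataSetName": "demo", "paleoData_variableName": "Age"},
    {"dataSetName": "demo", "paleoData_variableName": "temperature"},
]

kept = paleoData_cleanup(rows)
print([row["paleoData_variableName"] for row in kept])
# prints ['d18O', 'temperature']
```

If the input is a pandas dataframe instead, the same predicate could presumably be used as a boolean mask on the paleoData_variableName column. Wiring it into get_timeseries_essentials() behind a flag that defaults to False would cover the first option.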
I would view it as one of the main improvements of pyLipd over its predecessor if it alleviated the need to manually remove these garbage series.