LinkedEarth / pylipd

Development repository for Python LiPD utilities
https://pylipd.readthedocs.io/en/latest/
Apache License 2.0

`get_timeseries` no longer has archiveType #65

Closed by khider 1 month ago

khider commented 1 month ago

To reproduce:

from pylipd.lipd import LiPD

# Load a single LiPD dataset from the LiPDverse
lipd = LiPD()
lipd.load(["https://lipdverse.org/data/LCf20b99dfe8d78840ca60dfb1f832b9ec/1_0_1//Nunalleq.Ledger.2018.lpd"])

# Extract the time series for every loaded dataset
ts_list = lipd.get_timeseries(lipd.get_all_dataset_names())

for dsname, tsos in ts_list.items():
    for tso in tsos:
        if 'paleoData_variableName' in tso:
            # 'archiveType' is no longer present in tso, so this lookup fails
            print(dsname+': '+tso['paleoData_variableName']+': '+tso['archiveType'])

varunratnakar commented 1 month ago

In the new LiPD, the archiveType is assigned only to the dataset, not to each variable. The semantics here are confusing. Can you clarify whether it needs to be transferred to each variable? If so, is that only for the purpose of this function, or would you expect to run SPARQL queries over the archiveType of each variable?

varunratnakar commented 1 month ago

I've added it to the get_timeseries function, so each variable will have that information (though it wouldn't be stored at the variable level in the graph).

https://github.com/LinkedEarth/pylipd/commit/c64b04129ed60abb985abc45c8bae622a0e1b769
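
For readers following along, here is a minimal sketch of the idea described in this comment: a dataset-level attribute is copied onto each per-variable entry when the entries are flattened, while the stored data keeps it at the dataset level only. This is not the code from the linked commit. The `dataset` dictionary, its values, and the `flatten_timeseries` helper are hypothetical stand-ins and do not reflect pylipd's internal graph or API.

```python
# Hypothetical stand-in data; not the actual pylipd graph or its API.
dataset = {
    "dataSetName": "ExampleDataset",
    "archiveType": "ExampleArchive",  # stored once, at the dataset level
    "variables": [
        {"paleoData_variableName": "temperature"},
        {"paleoData_variableName": "depth"},
    ],
}


def flatten_timeseries(dataset):
    """Return one dict per variable, each carrying the dataset-level archiveType."""
    rows = []
    for variable in dataset["variables"]:
        row = dict(variable)  # copy, so the stored variable is left unchanged
        row["dataSetName"] = dataset["dataSetName"]
        row["archiveType"] = dataset["archiveType"]
        rows.append(row)
    return rows


ts_rows = flatten_timeseries(dataset)
for row in ts_rows:
    print(row["dataSetName"] + ": " + row["paleoData_variableName"] + ": " + row["archiveType"])
```

The copy in `flatten_timeseries` mirrors the point made above: the flattened output gains `archiveType`, but the variable records themselves do not.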

khider commented 1 month ago

Yes, I was about to say that I don't envision running a SPARQL query on it, but I could see filtering the data frame on it after the fact.
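
A rough illustration of the after-the-fact filtering mentioned here, assuming each entry is a flat dict with an `archiveType` key, as in the reproduction above. The entries, their values, and the `filter_by_archive_type` helper are made up for the example; plain Python lists stand in for a data frame so the sketch has no dependencies.

```python
# Hypothetical flattened entries, shaped like the per-variable dicts above.
ts_entries = [
    {"dataSetName": "DatasetA", "paleoData_variableName": "d18O", "archiveType": "Coral"},
    {"dataSetName": "DatasetB", "paleoData_variableName": "temperature", "archiveType": "Wood"},
    {"dataSetName": "DatasetC", "paleoData_variableName": "SrCa", "archiveType": "Coral"},
]


def filter_by_archive_type(entries, archive_type):
    """Keep only the entries whose archiveType matches; entries without the key are skipped."""
    return [entry for entry in entries if entry.get("archiveType") == archive_type]


coral_entries = filter_by_archive_type(ts_entries, "Coral")
for entry in coral_entries:
    print(entry["dataSetName"] + ": " + entry["paleoData_variableName"])
```

If the entries were loaded into a pandas DataFrame `df` instead, the analogous step would be a boolean mask such as `df[df["archiveType"] == "Coral"]`.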