qurator-spk/mods4pandas
Extract the MODS/ALTO metadata of a bunch of METS/ALTO files into pandas DataFrames for data analysis
Apache License 2.0
Move on to Parquet format #36
Closed (mikegerber closed 3 months ago)
mikegerber commented 4 months ago
#33 and related problems suggest it is time to finally move to the Parquet format - sticking to an old pandas version to produce a "stable pickle" does not seem maintainable anymore.
[x] Review "set" columns like classification-ZVDD
[x] Review dtypes
mikegerber commented 3 months ago
Set columns become "arrays" now, due to conversion to Parquet and back
dtypes are as before
→ Deferring improvement on this to #37
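For anyone hitting the "arrays" behaviour in their own analysis, here is a minimal sketch of one possible workaround. It is not what mods4pandas does internally. It assumes set-valued cells are turned into lists before writing, and that the Parquet engine (pyarrow in my understanding) hands list columns back as NumPy arrays. The helper names `set_to_list`, `array_to_set` and `roundtrip_demo`, the file name and the sample values are all made up for illustration.

```python
import math
from typing import Any, Iterable, Optional


def _is_missing(cell: Any) -> bool:
    """True for None and for float NaN, the usual pandas missing-value markers."""
    return cell is None or (isinstance(cell, float) and math.isnan(cell))


def set_to_list(cell: Optional[Iterable[Any]]) -> Optional[list]:
    """Turn a set-valued cell into a sorted list before writing to Parquet.

    Sorting assumes the elements are mutually comparable (e.g. all strings).
    """
    if _is_missing(cell):
        return None
    return sorted(cell)


def array_to_set(cell: Optional[Iterable[Any]]) -> Optional[set]:
    """Turn a list/array-valued cell, as read back from Parquet, into a set."""
    if _is_missing(cell):
        return None
    return set(cell)


def roundtrip_demo(path: str = "demo.parquet") -> None:
    """Hypothetical round trip; needs pandas plus a Parquet engine such as pyarrow."""
    # Imported here so the helpers above stay dependency-free.
    import pandas as pd

    # Made-up sample values, not real classification data.
    df = pd.DataFrame({"classification-ZVDD": [{"b", "a"}, None, {"c"}]})
    df["classification-ZVDD"] = df["classification-ZVDD"].map(set_to_list)
    df.to_parquet(path)

    df_back = pd.read_parquet(path)
    # Cells come back as array-like values; convert them to sets again.
    df_back["classification-ZVDD"] = df_back["classification-ZVDD"].map(array_to_set)
    print(df_back["classification-ZVDD"].tolist())
```

Calling `roundtrip_demo()` writes a small file and prints the restored cells. Whether converting back to sets is worth it depends on the analysis; for membership tests the arrays may already be good enough.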