silknow / crawler

SILKNOW crawler that collects metadata records describing silk material from various museums
Apache License 2.0

Logbook has been updated with Addendum for paper-based materials #29

Closed rtroncy closed 3 years ago

rtroncy commented 3 years ago

The Logbook for Harvesting Data from Museums has been updated over time.

@ehrhart The museum sources that have already been crawled need to be reviewed, since the search strategies have been enhanced. Pay particular attention to the Addendum, which has been added for a large number of sources.

rtroncy commented 3 years ago

The addendum is particularly relevant for some museums, such as CERES; see https://docs.google.com/document/d/1Tg4gDlz8HjILPRQ8XBOBFo-3-74nmJdJ31YL6PUtdB4/edit#heading=h.sfeb7xefe1rk as well as this email and this email.

ehrhart commented 3 years ago

@tschleider

CERES Museum has been updated

Records: ceres-mcu_records_20210503_4.tar.gz
Images: ceres-mcu_files_20210503_4.tar.gz

Note: the fields that come from museosdeandalucia differ slightly from the others, since that website has a different layout and data structure. Please refer to the updated list of known fields for some examples.

ARTIC Museum has been updated

Records: artic_records_20210503_2.tar.gz
Images: artic_files_20210503_2.tar.gz

rtroncy commented 3 years ago

@ehrhart Can this issue be closed? What is left to do?

ehrhart commented 3 years ago

This issue can indeed be closed since all the changes have been implemented.