Open stacimc opened 2 years ago
File size and file type can be backfilled together with the image dimensions data. Here is the current status of file type and file size support for each provider:
Provider | File type in the provider script | File size in the provider script | Backfill for file type | Backfill for file size |
---|---|---|---|---|
Smithsonian | needs to be added | needs to be added | - | - |
Raw Pixel | needs to be added | needs to be added | - | - |
Finnish Museums | needs to be added | needs to be added | - | - |
NYPL | added in WordPress/openverse-catalog#630 | needs to be added | not run yet | - |
Phylopic | added in WordPress/openverse-catalog#547 | needs to be added | not run yet | - |
Metropolitan Museum of Art | added in WordPress/openverse-catalog#568 | needs to be added | not run yet | - |
Cleveland Museum of Art | added in WordPress/openverse-catalog#537 | added in WordPress/openverse-catalog#537 | not run yet | not run yet |
Museums Victoria | added in WordPress/openverse-catalog#600 | needs to be added | not run yet | - |
SMK | added in WordPress/openverse-catalog#542 | added in WordPress/openverse-catalog#542 | not run yet | not run yet |
Science Museum | added in WordPress/openverse-catalog#576 | needs to be added | not run yet | - |
Walters Art Museum | cannot fix due to WordPress/openverse#1637 | - | - | - |
Brooklyn Museum | cannot fix due to WordPress/openverse#1638 | - | - | - |
Europeana | needs to be added as part of fixing WordPress/openverse#1727 | - | - | - |
Problem
Depends on WordPress/openverse#1486
Once we've added image dimensions detection for the providers that don't currently support them, we'll need to backfill the data for previously ingested records. The providers to backfill are:
* Since Metropolitan and Europeana are dated DAGs, we could potentially rely on their reingestion workflow to backfill the data over time (related: WordPress/openverse#1501).
Implementation
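As a rough illustration of the per-record work a backfill might do, here is a minimal sketch. It is not the catalog's actual code: the function names, the `filetype`/`filesize` record keys, and the `jpeg` to `jpg` normalization are all assumptions made for the example. It guesses the file type from the image URL's extension and reads the file size from a `Content-Length` response header, filling only fields that are currently missing.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical alias table; the real catalog may normalize differently.
FILETYPE_ALIASES = {"jpeg": "jpg", "tif": "tiff"}


def guess_filetype(url):
    """Return a lowercase file type from the URL path's extension, or None."""
    path = urlparse(url).path  # drops the query string and fragment
    ext = PurePosixPath(path).suffix.lower().lstrip(".")
    if not ext:
        return None
    return FILETYPE_ALIASES.get(ext, ext)


def parse_filesize(headers):
    """Return the size in bytes from a Content-Length header, or None."""
    # Header names are case-insensitive, so normalize the keys first.
    normalized = {key.lower(): value for key, value in headers.items()}
    try:
        size = int(normalized["content-length"])
    except (KeyError, TypeError, ValueError):
        return None
    return size if size >= 0 else None


def backfill_fields(record, headers):
    """Return only the fields that are missing on the record and can be filled."""
    updates = {}
    if record.get("filetype") is None:
        filetype = guess_filetype(record["url"])
        if filetype is not None:
            updates["filetype"] = filetype
    if record.get("filesize") is None:
        filesize = parse_filesize(headers)
        if filesize is not None:
            updates["filesize"] = filesize
    return updates
```

The headers would come from a HEAD (or GET) request to the image URL, made by whatever HTTP client the backfill uses. Keeping the parsing separate from the request makes it possible to test the logic without network access, and returning only the changed fields means records that already have values are never overwritten.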