-
## Suggested Improvement
Refactor the Metropolitan Museum provider script to use the new `ProviderDataIngester` base class.
## Benefit
More details in WordPress/openverse-catalog#229
## Imp…
-
This could be harder than it looks, but we might want to have a "normalization" mapping from DBPedia categories to more commonly used NE categories. For now, on some output data, we see these categories.…
-
It would be helpful to have a de-duplication utility that would read the `data.csv` and look for possible duplicate entries. One way of doing this would be to aggregate the rows by their `repository_n…
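
A minimal sketch of that aggregation idea, assuming the file is a plain CSV with a header row. The grouping column name is truncated in the text above, so it is passed in as a parameter here; `find_possible_duplicates` and `key_columns` are hypothetical names used only for illustration, not part of any existing tool.

```python
import csv
from collections import defaultdict


def find_possible_duplicates(csv_path, key_columns):
    """Group rows by the given columns and return groups with more than one row.

    key_columns is an assumption: pass whichever column(s) should identify
    a duplicate entry in your data.
    """
    groups = defaultdict(list)
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        # row_number counts data rows only (the header row is not counted)
        for row_number, row in enumerate(reader, start=1):
            # Normalize whitespace and case so near-identical values group together
            key = tuple((row.get(col) or "").strip().lower() for col in key_columns)
            groups[key].append(row_number)
    # Keep only the keys shared by two or more rows
    return {key: rows for key, rows in groups.items() if len(rows) > 1}
```

Calling `find_possible_duplicates("data.csv", [...])` with the chosen column names returns a dictionary mapping each repeated key to the data-row numbers that share it, which can then be reviewed by hand. Exact matching after lowercasing is a deliberately simple choice; fuzzy matching would catch more candidates at the cost of false positives.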
-
## Description
Several ingestion days during a recent run of `metropolitan_museum_reingestion_workflow` have raised the following error:
```
File "/opt/airflow/openverse_catalog/dags/prov…
-
If you do a search for Artist Rembrandt and repository Metropolitan Museum of Art, you get 3 results. (URL too long to put in here). Why doesn't it show this record?
https://artresearch.net/resou…
-
## Description
Since migrating from Creative Commons, we have had very few providers running consistently. We would like to get these DAGs in a consistently operational state so that we are regular…
-
## Description
On each run of the `metropolitan_reingestion_workflow`, a handful of reingestion days fail during the `create_loading_table` step due to the error:
```
psycopg2.errors.Duplicat…
-
A user/editor asked me the following question, which I am passing along here:
Is it possible to leave certain fields prefilled in a kind of template when creating an image medium?
For example, when uploading abo…
-
As an exhibit creator for the upcoming Martin Wong Catalog Raisonné exhibit, I need italics to display in the metadata record for journal titles, book titles and titles of exhibitions, on both the PURL …
-
## Description
Some DAGs are broken. This can happen for a few reasons:
- A provider releases a new version of their API
- A scraping-based provider changes the HTML structure of their site
…