Bug Description
A significant portion of the Foreign IDs for images from the Met Museum have been 'randomly shuffled' on every collection run since the very beginning.
This blocks our deduplication of those images (see #188). It also means that it will be quite difficult, and possibly impossible, to reassociate the Clarifai image tags with the images they actually belong to.
Expected behavior
The Foreign ID should never change for a given image.
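One way to get that property is to derive the Foreign ID only from values that do not depend on collection order. The following is a minimal sketch, not the current script: the function names, the record fields, and the choice of hashing the image URL are all assumptions made for illustration, and the scheme only holds if the image URL for a given image is itself stable.

```python
import hashlib


def deterministic_foreign_id(object_id, image_url):
    """Return an ID that depends only on the object and the image URL."""
    # Hypothetical scheme: hash the URL so the ID does not depend on the
    # order in which the images of one object happen to be returned.
    digest = hashlib.sha256(image_url.strip().encode("utf-8")).hexdigest()
    return f"{object_id}-{digest[:16]}"


def build_image_records(object_id, image_urls):
    """Build one record per distinct image URL of a single object."""
    records = {}
    for url in image_urls:
        foreign_id = deterministic_foreign_id(object_id, url)
        records[foreign_id] = {
            "foreign_identifier": foreign_id,
            "image_url": url.strip(),
            # Keep the real-world object alongside the image (hypothetical field).
            "meta_data": {"object_id": object_id},
        }
    # Sort so the output itself is also independent of input order.
    return [records[key] for key in sorted(records)]
```

With this approach, collecting the images of one object in a different order yields the same set of Foreign IDs, which is the property the deduplication needs.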
Screenshots
Additional context
The 'random shuffle' is within the set of images for a given real-world object (so, the IDs for images of a given painting are mixed up).
Of the cases I've observed, the Clarifai tags appear to be related to the same painting, but not the same image of the painting.
If we want to save the tags, the best process I can think of is:
1. Modify the script to save the images with a deterministic Foreign ID, and also save metadata about which real-world object the image shows.
2. Run that script on the entire collection. This duplicates every Met Museum image once more (but in a controlled manner) in the image table (the main table with our image metadata).
3. Copy all rows containing Met Museum metadata from the image table into a new table.
4. Delete all rows containing Met Museum metadata from the image table.
At that point, #188 will be unblocked, but the front end will show no Met Museum images until the following steps are complete:
5. Write a script that cleans up and fixes the Met Museum data (this will be complicated and require serious effort).
6. Insert the cleaned Met Museum metadata into the image table.
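The copy-then-delete part of that process (moving the Met Museum rows out of the image table into a separate table) could look roughly like the sketch below. It uses an in-memory-friendly `sqlite3` connection purely for illustration; the real database, the `provider` column, the `'met'` value, and the `met_image_archive` table name are all assumptions, not the actual schema.

```python
import sqlite3


def archive_and_remove_met_rows(conn, provider="met"):
    """Copy one provider's rows to an archive table, then delete them.

    Returns (rows_copied, rows_deleted). Table and column names here are
    hypothetical placeholders for the real schema.
    """
    # Create an empty archive table with the same columns as `image`.
    conn.execute("CREATE TABLE met_image_archive AS SELECT * FROM image WHERE 0")
    # The copy and the delete are committed together when the block exits,
    # or rolled back together if either statement raises.
    with conn:
        copied = conn.execute(
            "INSERT INTO met_image_archive SELECT * FROM image WHERE provider = ?",
            (provider,),
        ).rowcount
        deleted = conn.execute(
            "DELETE FROM image WHERE provider = ?", (provider,)
        ).rowcount
    return copied, deleted
```

Comparing the two returned counts before going further is a cheap check that no row was deleted without first being archived.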