EOL / tramea

A lightweight server for denormalized EOL data

data object IDs tangled in versioning: database desynchronisation? #338

Open jhammock opened 7 years ago

jhammock commented 7 years ago

From @hyanwong: here's a weird one. http://eol.org/pages/397688/media shows 2 identical media images with the same data object ID: 33526328. And when that page is checked for images via the API (http://eol.org/api/pages/1.0.json?batch=false&id=397688&images_per_page=1&images_page=1), it returns an older version of the same image (data object id 31527888). There must be some database desynchronisation going on here, or something?

And another case which might be related: http://eol.org/data_objects/29199170 (an old page, which has a newer version) is returned by the pages API for page 6199450. But in this case, both the old and the new data objects link to a different page (http://eol.org/pages/45865727), which, very oddly, claims to have no media objects. Seems like more DB desync problems to me.
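For anyone wanting to reproduce the first case, here is a minimal Python sketch. The endpoint and query parameters are copied from the API URL quoted above. The helper names are hypothetical, and the IDs are assumed to be read off the media tab and the API response by hand, so no response field names are assumed.

```python
from urllib.parse import urlencode

# Endpoint taken from the API URL quoted in the report.
API_BASE = "http://eol.org/api/pages/1.0.json"


def pages_api_url(page_id, images_per_page=1, images_page=1):
    """Build the pages API URL with the same parameters as the report."""
    query = urlencode({
        "batch": "false",
        "id": page_id,
        "images_per_page": images_per_page,
        "images_page": images_page,
    })
    return f"{API_BASE}?{query}"


def stale_ids(api_ids, website_ids):
    """Hypothetical helper: IDs the API returned that the media tab does not show."""
    return sorted(set(api_ids) - set(website_ids))


if __name__ == "__main__":
    print(pages_api_url(397688))
    # IDs from the report: the API returned 31527888, the media tab shows 33526328.
    print(stale_ids([31527888], [33526328]))
```

Any ID that `stale_ids` returns is a candidate for an outdated version still being served by the API.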

AmrMMorad commented 7 years ago

The problem is mainly in Solr. The Solr index holds old versions of the data objects (media). When images are shown in the media tab, the exemplar is chosen to be the latest version of the image, so the website shows the latest version. The API, however, runs its query against Solr without adjusting the version, and because Solr holds old versions, the API returns the old ones. I will continue investigating once I get access to the search production server, where I can check the last update time for the index.
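To illustrate what "adjusting the version" could look like, here is a rough Python sketch, not the actual EOL code. It assumes that all versions of a data object share a `guid` and that a higher data object ID means a newer version. Both are assumptions for illustration, as are the function and field names.

```python
def latest_by_guid(versions):
    """Map each guid to its highest data object ID.

    Assumes a higher ID means a newer version (not confirmed in this thread).
    """
    latest = {}
    for v in versions:
        if v["id"] > latest.get(v["guid"], -1):
            latest[v["guid"]] = v["id"]
    return latest


def adjust_versions(solr_hits, versions):
    """Replace each ID returned by the index with the latest ID for the same guid.

    IDs with no known version record are passed through unchanged, and
    duplicates produced by the mapping are dropped while keeping order.
    """
    id_to_guid = {v["id"]: v["guid"] for v in versions}
    latest = latest_by_guid(versions)
    resolved = []
    for hit in solr_hits:
        newest = latest.get(id_to_guid.get(hit), hit)
        if newest not in resolved:
            resolved.append(newest)
    return resolved


if __name__ == "__main__":
    # The two IDs come from the report; the guid value is made up.
    versions = [
        {"id": 31527888, "guid": "example-guid"},
        {"id": 33526328, "guid": "example-guid"},
    ]
    print(adjust_versions([31527888], versions))
```

With a step like this, a stale index would still return old IDs, but the API response would point at the same version the website shows. Reindexing Solr would remain the underlying fix.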