Open stacimc opened 10 months ago
Hi there! Thanks for flagging this. I've passed it on to our API staff, who have looked into whether this could be an issue on Europeana's end. They wanted me to pass on that the cursor values our API returns are created by Solr, which uses Base64 encoding to generate them. This means that the following characters can appear in the cursor value: "the upper- and lower-case Roman alphabet characters (A–Z, a–z), the numerals (0–9), and the "+" and "/" symbols, with the "=" symbol as a special suffix code."
We therefore believe that the `\u001d` character should not be able to be generated by our API cursor generator. I hope this helps narrow down your issue!
Best
Jolan
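
As a quick way to check the point above, here is a minimal Python sketch that lists any characters in a cursor falling outside the Base64 alphabet, applied to the cursor reported in this issue. It is not part of the Europeana API or the DAG code, and the helper name `invalid_cursor_chars` is hypothetical. It also shows how the stray character ends up as `%1D` once the cursor is URL-encoded.

```python
import string
from urllib.parse import quote

# Standard Base64 alphabet plus the "=" padding symbol.
BASE64_ALPHABET = set(string.ascii_letters + string.digits + "+/=")


def invalid_cursor_chars(cursor: str) -> list:
    """Return the characters in `cursor` that are not valid Base64 symbols."""
    return [ch for ch in cursor if ch not in BASE64_ALPHABET]


# Cursor value from this issue; "\u001d" is the GROUP SEPARATOR control character.
cursor = "AoIvLzEwMjgvRTAwMjc3MjQyc/yK+Z+NAw=\u001d"

bad_chars = invalid_cursor_chars(cursor)
print([hex(ord(ch)) for ch in bad_chars])  # code points of the offending characters

# Percent-encode everything outside the unreserved set, including "/" and "+".
encoded = quote(cursor, safe="")
print(encoded)
```

If `bad_chars` is non-empty, the cursor was presumably altered somewhere between the API response and the next request, which is one place to start looking.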
Description
Identified in a production Europeana run. It seems like some Europeana cursor values are not being encoded properly, resulting in a 400.
Reproduction
Run the Europeana DAG with the following for `initial_query_params` (but replace the value for `wskey` with the correct API key):

The DAG will fail immediately.
The cursor is `AoIvLzEwMjgvRTAwMjc3MjQyc/yK+Z+NAw=\u001d`. It is URL-encoded, resulting in the URL that is requested by the DAG:
https://api.europeana.eu/record/v2/search.json?wskey=***&profile=rich&reusability=open&reusability=restricted&sort=europeana_id%2Bdesc&sort=timestamp_created%2Bdesc&rows=100&media=true&start=1&qf=TYPE%3AIMAGE&qf=provider_aggregation_edm_isShownBy%3A%2A&query=timestamp_update%3A%5B2024-01-25T00%3A00%3A00Z+TO+2024-01-26T00%3A00%3A00Z%5D&cursor=AoIvLzEwMjgvRTAwMjc3MjQyc%2FyK%2BZ%2BNAw%3D%1D

The full API response is: