impresso / impresso-middle-layer

Middle layer API
GNU Affero General Public License v3.0

Solr: modification of fields related to access rights #462

Open e-maud opened 3 days ago

e-maud commented 3 days ago

New information regarding access rights and copyrights

...are coming in the main SOLR document index (solr2), and would need to be reflected in the middle layer.

As defined in the access-right schema, information should be displayed at newspaper and content item levels, and be shipped to both the WebApp and the API / Python Library.

Content item level (SOLR)

This is where the changes happen in the main Solr index: existing fields are modified and new fields are added.

Since the field values always come from the same fixed set, short surrogates are used in their place. The mapping between the full values and their surrogates is currently in enums.py in the solr repo, but will move to impresso-essentials.
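To illustrate what a surrogate mapping of this kind could look like as an Enum, here is a minimal sketch. The class name, member names, surrogates and full values below are all invented for illustration; they are not the contents of enums.py and not the actual values of any Solr field.

```python
from enum import Enum


class ExampleSurrogate(Enum):
    """Hypothetical mapping; every value here is illustrative only."""

    # Each member carries (short surrogate stored in the index, full value).
    FULL_VALUE_A = ("a", "Full value A")
    FULL_VALUE_B = ("b", "Full value B")
    FULL_VALUE_C = ("c", "Full value C")

    def __init__(self, surrogate: str, full_value: str) -> None:
        self.surrogate = surrogate
        self.full_value = full_value

    @classmethod
    def from_surrogate(cls, surrogate: str) -> "ExampleSurrogate":
        """Look up the member whose surrogate matches the indexed value."""
        for member in cls:
            if member.surrogate == surrogate:
                return member
        raise ValueError(f"Unknown surrogate: {surrogate!r}")
```

A consumer would call `ExampleSurrogate.from_surrogate("b").full_value` to turn an indexed surrogate back into its display value; an unknown surrogate raises `ValueError` rather than passing through silently.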

Questions

  1. Are the possible values of copyright_detail_i OK or too cryptic / similar to bitmaps?
  2. Do you want to filter on permitted uses (and then the field needs to be indexed)?
  3. The source of truth for the mapping is now in the solr repo and will go in impresso-essentials as an Enum: OK for you, or do you prefer a JSON file?

Newspaper level (MySQL)

Example in the current new MySQL database:

[image: example from the new MySQL database]

This information should continue to be shown on the newspaper page, as it is currently.

theorm commented 2 days ago

Hi @e-maud , thanks for the detailed description of the changes. My answers:

Are the possible values of copyright_detail_i OK or too cryptic / similar to bitmaps?
It doesn't matter for IML, as long as they are consistent. We can add our own mapping in IML if they need to be changed.

Do you want to filter on permitted uses (and then the field needs to be indexed)?
We can, but I'm not sure if this is needed (I can't think of any scenario right now). I think it's a question for @danieleguido and @mduering.

The source of truth for mapping is now in solr repo, will go in impresso-essentials as Enum: OK for you, or do you prefer a JSON file?
We won't be reading the JSON file automatically, so an Enum is fine.

What was the reason for changing access_right_s to data_domain_s? It's easy to change in the code, but then it won't be compatible with the old Solr instance. Is it right to say that the access_right_s field has been deprecated and removed and data_domain_s has been added? It will be easier to treat it this way.
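As a rough illustration of the "deprecated and removed, plus newly added" reading, a consumer that has to work against both the old and the new Solr instance could prefer the new field and fall back to the old one. This is only a sketch in Python for brevity: the function name is hypothetical, and it assumes the two fields carry comparable values, which the thread does not confirm.

```python
from typing import Any, Mapping, Optional

NEW_FIELD = "data_domain_s"   # field name from the thread (new index)
OLD_FIELD = "access_right_s"  # field name from the thread (old index)


def read_data_domain(doc: Mapping[str, Any]) -> Optional[str]:
    """Return the value of the new field, else the old one, else None.

    Assumption: values of the old and new field are interchangeable.
    """
    value = doc.get(NEW_FIELD)
    if value is None:
        # Document comes from an index that predates the new field.
        value = doc.get(OLD_FIELD)
    return value
```

Documents from the new index resolve through `data_domain_s`; documents from the old index resolve through `access_right_s`; documents with neither field yield `None`.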

e-maud commented 2 days ago

Hi @theorm,

danieleguido commented 1 day ago

Hi @e-maud and @theorm, I agree that adding data_domain_s is better than replacing access_right_s, and there is no problem adding the other fields. Regarding permitted uses, there is no need to filter on them imo, as we already have data_domain_s.
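For reference, filtering on data_domain_s alone could be expressed as a standard Solr filter query (`fq`). The sketch below only builds the query string; the function name is hypothetical and the domain values passed in are placeholders, since the thread does not list the actual values.

```python
from typing import List, Sequence, Tuple
from urllib.parse import urlencode


def build_select_params(query: str, data_domains: Sequence[str]) -> str:
    """Build URL-encoded Solr select parameters with an optional fq clause."""
    params: List[Tuple[str, str]] = [("q", query)]
    if data_domains:
        # Quote each value and escape backslashes and embedded quotes.
        quoted = [
            '"{}"'.format(d.replace("\\", "\\\\").replace('"', '\\"'))
            for d in data_domains
        ]
        params.append(("fq", "data_domain_s:({})".format(" OR ".join(quoted))))
    return urlencode(params)
```

Calling `build_select_params("*:*", ["domain_a", "domain_b"])` yields a query string whose `fq` parameter restricts results to documents in either placeholder domain.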