agpul closed this issue 9 months ago
My proposal to fit the Figma is based on the information in C. v4.00 EOSC Multi-Provider Cat_409f63efbd6a44ffac0406c8c590dff5-181023-1041-6200.pdf and the Ruby schema here. It produces the following output:
catalogue_output_schema = {
"abbreviation": "string",
"description": "string",
"id": "string",
"keywords": "array<string>",
"keywords_tg": "array<string>",
"title": "string",
"type": "string",
"scientific_domain": "array<string>",
"scientific_subdomain": "array<string>",
"structure_type": "array<string>",
"legal_status": "array<string>",
}
This will be converted into the following Solr schema:
<field name="abbreviation" type="text_general" indexed="true" stored="true" multiValued="false"/>
<field name="description" type="text_general" indexed="true" stored="true" multiValued="false"/>
<field name="id" type="string" termPositions="true" termVectors="true" termOffsets="true" required="true" useDocValuesAsStored="true"/>
<field name="keywords" type="strings" indexed="true" useDocValuesAsStored="true"/>
<field name="keywords_tg" type="text_general" indexed="true" stored="true" termPositions="true" termVectors="true" termOffsets="true" required="false"/>
<field name="title" type="text_general" indexed="true" stored="true" multiValued="false"/>
<field name="title_str" type="lowercase" indexed="true" stored="true"/>
<field name="type" type="string" indexed="true" stored="true"/>
<field name="scientific_domain" type="strings" indexed="true" useDocValuesAsStored="true"/>
<field name="scientific_subdomain" type="strings" indexed="true" useDocValuesAsStored="true"/>
<field name="structure_type" type="strings" indexed="true" useDocValuesAsStored="true"/>
<field name="legal_status" type="strings" indexed="true" useDocValuesAsStored="true"/>
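As a quick sanity check, the output schema and the Solr field list above can be compared by name. The following is only a sketch: the `<fields>` wrapper element and the helper names `field_names` and `compare` are assumptions added for illustration, and the check covers field names only, not types.

```python
import xml.etree.ElementTree as ET

# Output schema as proposed above.
catalogue_output_schema = {
    "abbreviation": "string",
    "description": "string",
    "id": "string",
    "keywords": "array<string>",
    "keywords_tg": "array<string>",
    "title": "string",
    "type": "string",
    "scientific_domain": "array<string>",
    "scientific_subdomain": "array<string>",
    "structure_type": "array<string>",
    "legal_status": "array<string>",
}

# Solr fields as proposed above, wrapped in a hypothetical <fields> root
# so that they parse as a single XML document.
SOLR_FIELDS = """
<fields>
  <field name="abbreviation" type="text_general" indexed="true" stored="true" multiValued="false"/>
  <field name="description" type="text_general" indexed="true" stored="true" multiValued="false"/>
  <field name="id" type="string" termPositions="true" termVectors="true" termOffsets="true" required="true" useDocValuesAsStored="true"/>
  <field name="keywords" type="strings" indexed="true" useDocValuesAsStored="true"/>
  <field name="keywords_tg" type="text_general" indexed="true" stored="true" termPositions="true" termVectors="true" termOffsets="true" required="false"/>
  <field name="title" type="text_general" indexed="true" stored="true" multiValued="false"/>
  <field name="title_str" type="lowercase" indexed="true" stored="true"/>
  <field name="type" type="string" indexed="true" stored="true"/>
  <field name="scientific_domain" type="strings" indexed="true" useDocValuesAsStored="true"/>
  <field name="scientific_subdomain" type="strings" indexed="true" useDocValuesAsStored="true"/>
  <field name="structure_type" type="strings" indexed="true" useDocValuesAsStored="true"/>
  <field name="legal_status" type="strings" indexed="true" useDocValuesAsStored="true"/>
</fields>
"""


def field_names(xml_text):
    """Return the name attribute of every <field> element."""
    root = ET.fromstring(xml_text.strip())
    return [field.get("name") for field in root.iter("field")]


def compare(output_schema, solr_xml):
    """Report names present on one side but not the other."""
    solr_names = set(field_names(solr_xml))
    output_names = set(output_schema)
    return {
        "missing_in_solr": sorted(output_names - solr_names),
        "only_in_solr": sorted(solr_names - output_names),
    }


result = compare(catalogue_output_schema, SOLR_FIELDS)
print(result)
```

With the definitions above, every output field has a Solr counterpart, and `title_str` is the only field that exists in Solr without appearing in the output.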
Currently, the legal_status field is missing, but it will be added in the future. However, there is no obstacle to adding it to Solr at this time.
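If the field were added to a running instance through Solr's Schema API instead of by editing the schema file, the request body could look roughly like the sketch below. The host, port, and collection name in `SCHEMA_URL` are assumptions for illustration; the code only builds and prints the payload and does not send anything.

```python
import json

# Attributes copied from the proposed legal_status <field> definition.
legal_status_field = {
    "name": "legal_status",
    "type": "strings",
    "indexed": True,
    "useDocValuesAsStored": True,
}

# The Schema API takes an "add-field" command in a JSON body.
payload = json.dumps({"add-field": legal_status_field}, indent=2)

# Hypothetical endpoint: host and collection name are assumptions.
SCHEMA_URL = "http://localhost:8983/solr/catalogue/schema"

print(f"POST {SCHEMA_URL}")
print(payload)
```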
The Figma includes an explanation for each field:
NI4OS -> abbreviation
National Ini... -> title
Catalog -> type
Scientific Domain -> scientific_domain
Scientific Subdomain -> scientific_subdomain
Structure Type -> structure_type
Legal Status -> legal_status (not available right now)
National Initiatives for Open Science..... -> description
I also need a decision on which fields not included in the Figma will be used as filters and added to the schema.
The following fields, which are both in MP, are currently unused:
According to C. v4.00 EOSC Multi-Provider Catalogue Profile, there should also be:
The Solr schema, based on the Figma, is as follows:
<field name="abbreviation" type="text_general" indexed="true" stored="true" multiValued="false"/>
<field name="description" type="text_general" indexed="true" stored="true" multiValued="false"/>
<field name="id" type="string" termPositions="true" termVectors="true" termOffsets="true" required="true" useDocValuesAsStored="true"/>
<field name="keywords" type="strings" indexed="true" useDocValuesAsStored="true"/>
<field name="keywords_tg" type="text_general" indexed="true" stored="true" termPositions="true" termVectors="true" termOffsets="true" required="false"/>
<field name="title" type="text_general" indexed="true" stored="true" multiValued="false"/>
<field name="type" type="string" indexed="true" stored="true"/>
<field name="scientific_domains" type="strings" indexed="true" useDocValuesAsStored="true"/>
<field name="structure_type" type="strings" indexed="true" useDocValuesAsStored="true"/>
<field name="legal_status" type="strings" indexed="true" useDocValuesAsStored="true"/>
The output is as follows:
catalogue_output_schema = {
"abbreviation": "string",
"description": "string",
"id": "string",
"keywords": "array<string>",
"keywords_tg": "array<string>",
"title": "string",
"type": "string",
"scientific_domains": "array<string>",
"structure_type": "array<string>",
"legal_status": "array<string>",
}
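To make the difference between the two output schemas in this comment easier to see, here is a small sketch that diffs their field names. The variable names are only illustrative; the field names are copied from the two listings.

```python
# Field names from the first output schema in this comment.
first_output = {
    "abbreviation", "description", "id", "keywords", "keywords_tg",
    "title", "type", "scientific_domain", "scientific_subdomain",
    "structure_type", "legal_status",
}

# Field names from the second output schema in this comment.
final_output = {
    "abbreviation", "description", "id", "keywords", "keywords_tg",
    "title", "type", "scientific_domains", "structure_type",
    "legal_status",
}

removed = sorted(first_output - final_output)
added = sorted(final_output - first_output)

print("removed:", removed)
print("added:", added)
```

This reports `scientific_domain` and `scientific_subdomain` as removed and `scientific_domains` as added. The Solr listings differ in the same way, and the second one also no longer contains `title_str`.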
This is my final proposal. @agpul
Acceptance criteria:
This issue is blocked by: https://github.com/cyfronet-fid/marketplace/issues/3068