ontoportal-lirmm / goo

Graph Oriented Objects (GOO) for Ruby. A RDF/SPARQL based ORM.
http://ncbo.github.io/goo/
Other
0 stars 4 forks source link

Feature: Enhance SOLR integration and add a Schema API #54

Closed syphax-bouazzouni closed 8 months ago

syphax-bouazzouni commented 8 months ago

Prerequisites

Goals

Context

SOLR is the indexing tool, that we use for our search features, it works by defining a collection (a table in the databases world), and for each collection, a schema defines the properties to index by giving its type, a list, or not, ... and also some dynamic or special fields to handle fuzzy search, or other.

The requirement was that we were required to define the collection and the schema, in XML configuration files at the start, and then after we could not change it in the code. This meant we were limited in the action that we could do, and it was hard to add new search features to our system. as these files were static, and you had to update the schema and create the collection configuration files each time you wanted to add something into the index.

This PR, integration the SOLR Schema API, in this project, gives us the option to create/delete a collection and update a collection schema dynamically, the following actions were implemented (see the full list in the SOLR::Admin, SOLR::Schema and SOLR::SchemaGenerator modules in the code):

In Addition to the implementation of the SOLR Schema and admin APIs, we added a dsl to the Goo model, to enable index for any model, either in a schemaless mode or in a custom schema model

Changes

codecov[bot] commented 8 months ago

Codecov Report

Attention: Patch coverage is 81.01604% with 71 lines in your changes are missing coverage. Please review.

Project coverage is 86.36%. Comparing base (1be1c83) to head (84aa3db).

Files Patch % Lines
lib/goo/search/solr/solr_schema.rb 72.34% 26 Missing :warning:
lib/goo/search/search.rb 82.79% 16 Missing :warning:
lib/goo/search/solr/solr_admin.rb 69.76% 13 Missing :warning:
lib/goo/search/solr/solr_query.rb 87.50% 6 Missing :warning:
lib/goo/search/solr/solr_schema_generator.rb 85.36% 6 Missing :warning:
lib/goo.rb 92.30% 2 Missing :warning:
lib/goo/search/solr/solr_connector.rb 91.66% 2 Missing :warning:
Additional details and impacted files ```diff @@ Coverage Diff @@ ## development #54 +/- ## =============================================== + Coverage 85.93% 86.36% +0.43% =============================================== Files 41 46 +5 Lines 2702 3022 +320 =============================================== + Hits 2322 2610 +288 - Misses 380 412 +32 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

mdorf commented 8 months ago

@syphax-bouazzouni, this looks great! Really useful feature for us as well! It would be great to test it against the latest Solr 9.5.0 to make sure the code is compatible. I am wrapping up my work on a few search enhancements for the RADx project, which we plan to deploy to production shortly. The next major task is upgrading Solr to the latest version, which we may very well coincide with merging this pull request. Are you planning to make any other significant changes to this feature (you mentioned that this is a "first iteration)? Thank you!

syphax-bouazzouni commented 8 months ago

9.5.0

Hello, @mdorf, I tested it locally with Solr 9.5.0 and all the tests are green. I removed the "first iteration" term, as I implemented everything I had in mind for now: administrate SOLR in code, be backward compatible with the existent, and index models automatically at save/delete.

mdorf commented 6 months ago

@syphax-bouazzouni, I am working on merging this functionality into our develop branch. It's a bit tricky given that the pull request is not against our own repo. Probably will need to do a lot of manual merging. Were you planning on submitting this pull request against the ncbo repo? If so, should I wait for that or just proceed with my manual merging? Thank you!

jonquet commented 6 months ago

Hello, the proposition was to move and create PRs directly on the OntoPortal repo now that everyone is positioned under it. I think @syphax-bouazzouni has scheduled some time to create PR related to our work soon.

syphax-bouazzouni commented 6 months ago

@syphax-bouazzouni, I am working on merging this functionality into our develop branch. It's a bit tricky given that the pull request is not against our own repo. Probably will need to do a lot of manual merging. Were you planning on submitting this pull request against the ncbo repo? If so, should I wait for that or just proceed with my manual merging? Thank you!

Hello @mdorf, I would suggest waiting at least 1 month before merging this, as it is only tested in our development environment, and will be released in our next release, see https://github.com/ontoportal-lirmm/ontologies_api/pull/73

Once deployed to our production environment and tested, I will do a PR.

Is it good with you?

mdorf commented 6 months ago

@syphax-bouazzouni, @jonquet, no problem. In general, we are very interested in this feature to facilitate the functionality sought by the RADx project, in which our Solr index would be packaged in a way to be accessible by the third-party API. See https://github.com/bmir-radx/radx-project/issues/49. However, based on my conversation with @alexskr, this does not have the immediate urgency. A month is definitely reasonable for us to wait to be able to merge this feature in its more stable and tested iteration.