the point of this is to let the elastic index be populated directly from NormalizedData instead of AbstractCreativeWork. that makes way for deleting ShareObject and all its kin (including AbstractCreativeWork) without breaking the apps we've built that use the index directly: the preprints and registries discover pages, which will be updated to use a new search api once that api exists
changes
docker-compose updates
add indexer service
move common environment variables to .docker-compose.env
tell celery its app
share.search tidy-up
cleaner indexer daemon threading, messaging
remove unhelpfully redundant code
move elastic mappings and fetching logic to IndexSetup classes
available IndexSetups added to entry_points in setup.py
allow configuring a different IndexSetup for each index
currently only one share_classic IndexSetup that just wraps existing behavior
move index creation/deletion to ElasticManager
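a rough sketch of the IndexSetup idea, for reviewers who want the shape before reading the diff. this is not the code in this PR: the method names, the `INDEX_SETUPS` dict (standing in for the entry_points group in setup.py), the `INDEX_CONFIG` settings and the `example_index` name are all made up for illustration.

```python
from abc import ABC, abstractmethod


class IndexSetup(ABC):
    """Hypothetical base class: owns an index's mappings and how its documents are fetched."""

    @abstractmethod
    def mappings(self) -> dict:
        """Return the elastic mappings for an index that uses this setup."""

    @abstractmethod
    def build_documents(self, record_ids):
        """Yield one elastic document per record id."""


class ShareClassicSetup(IndexSetup):
    """Illustrative stand-in for a setup that wraps existing behavior."""

    def mappings(self) -> dict:
        return {'properties': {'title': {'type': 'text'}}}

    def build_documents(self, record_ids):
        for record_id in record_ids:
            # A real setup would fetch and serialize stored data here.
            yield {'_id': record_id, '_source': {'title': f'work {record_id}'}}


# Stand-in for the entry_points group: setup name -> class (assumed layout).
INDEX_SETUPS = {'share_classic': ShareClassicSetup}

# Hypothetical per-index settings: each index names the setup it uses.
INDEX_CONFIG = {
    'example_index': {'index_setup': 'share_classic'},
}


def get_index_setup(index_name: str) -> IndexSetup:
    """Look up and instantiate the setup configured for the given index."""
    setup_name = INDEX_CONFIG[index_name]['index_setup']
    return INDEX_SETUPS[setup_name]()
```

with this shape, adding a second kind of index means registering another IndexSetup subclass and pointing an index's config at it, without touching the daemon.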
ensure useful suids for all datums
require suids for pushed data
for compatibility, infer suids for data pushed from OSF
populate_osf_suids command to populate suids for existing OSF data
add FormattedMetadataRecord model
each suid has one FMR for each supported metadata format (see the list in setup.py under entry_points => share.metadata_formats)
the elasticsearch source document is now stored as an FMR during ingest, and the indexer daemon just pulls from that table (makes re-indexing much smoother)
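the FMR flow described above, sketched in plain python with an in-memory store instead of the real model. the class, function and field names here (`FormattedRecordStore`, `ingest`, `build_index_actions`, `example_elastic_format`) are assumptions for illustration, not the names used in this PR.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class FormattedRecord:
    """Hypothetical stand-in for one FormattedMetadataRecord row."""
    suid: str
    record_format: str
    formatted_metadata: str


class FormattedRecordStore:
    """In-memory stand-in for the table: at most one record per (suid, format)."""

    def __init__(self):
        self._records = {}

    def save(self, suid, record_format, formatted_metadata):
        record = FormattedRecord(suid, record_format, formatted_metadata)
        # Saving again for the same (suid, format) replaces the old record.
        self._records[(suid, record_format)] = record
        return record

    def get(self, suid, record_format):
        return self._records.get((suid, record_format))


def ingest(store, suid, normalized_data, formatters):
    """At ingest time, render and store one record per supported format."""
    for record_format, formatter in formatters.items():
        store.save(suid, record_format, formatter(normalized_data))


def build_index_actions(store, suids, record_format):
    """What the indexer does in this sketch: read stored records, no re-rendering."""
    for suid in suids:
        record = store.get(suid, record_format)
        if record is not None:
            yield {'_id': suid, '_source': json.loads(record.formatted_metadata)}
```

because the expensive formatting happens once at ingest, a re-index in this sketch is just a read over the stored records for the chosen format.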
todo
[x] tests for ElasticManager
[x] tests for IndexSetup(s)
[x] (better) tests for SearchIndexerDaemon
[x] postrend_backcompat IndexSetup that builds elastic documents from NormalizedData