clowder-framework / clowder

A data management system that allows users to share, annotate, organize and analyze large collections of datasets. It provides support for extensible metadata annotation using JSON-LD and a distributed analytics event bus for automatic curation of uploaded data.
https://clowderframework.org/
University of Illinois/NCSA Open Source License

same extractor shows up multiple times #327

Closed robkooper closed 2 years ago

robkooper commented 2 years ago

Describe the bug

When multiple instances of clowder are running, the heartbeat of a new extractor can result in the same extractor being added multiple times. This seems to be a race condition: each instance checks whether the extractor exists and adds it if it does not. Because the check and the add are separate steps, several instances can all pass the check before any of them has added the extractor, and each of them then adds it.
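A minimal sketch of the race, not Clowder's actual code: two threads stand in for two clowder instances, and a plain Python list or dict stands in for the shared extractor store. The function names and the `threading.Barrier` are assumptions for illustration; the barrier only makes the bad interleaving happen every time. The second function shows one possible way to avoid it, keying the store by extractor name so a second insert cannot create a second entry (comparable in spirit to a unique key or an upsert in a database).

```python
import threading


def run_two_instances(handler):
    """Run the same heartbeat handler in two threads, one per simulated instance."""
    threads = [threading.Thread(target=handler) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def racy_registration(name):
    """Check-then-insert in two separate steps (hypothetical stand-in for the bug)."""
    store = []                      # stands in for the shared extractor collection
    barrier = threading.Barrier(2)  # forces both checks to finish before any insert

    def on_heartbeat():
        exists = any(e["name"] == name for e in store)  # step 1: check
        barrier.wait()
        if not exists:
            store.append({"name": name})                # step 2: insert

    run_two_instances(on_heartbeat)
    return store


def keyed_registration(name):
    """Store keyed by extractor name, so a name can appear at most once."""
    store = {}

    def on_heartbeat():
        # Inserts only if the key is missing; a repeat heartbeat changes nothing.
        store.setdefault(name, {"name": name})

    run_two_instances(on_heartbeat)
    return list(store.values())
```

With these definitions, `racy_registration("wordcount")` returns two records for the same extractor, while `keyed_registration("wordcount")` returns one.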

To Reproduce

  1. Make sure the extractor does not exist yet
  2. Run multiple clowder instances
  3. Start the extractor
  4. The same extractor may show up multiple times in the extractor list

Expected behavior

The extractor should show up only once.

Additional context

We might need a second process that checks for and removes any duplicate instances of the same extractor.
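As a rough sketch of what such a cleanup pass could do (an illustration only, not existing Clowder code): group extractor records by name and keep the first one seen. The record shape and the choice of `name` as the identity key are assumptions; a real pass might also need to compare version or other fields.

```python
def remove_duplicate_extractors(extractors):
    """Return (kept, removed), keeping the first record seen for each name.

    `extractors` is a list of dicts with a "name" key (hypothetical shape).
    """
    seen = set()
    kept, removed = [], []
    for record in extractors:
        key = record["name"]
        if key in seen:
            removed.append(record)  # duplicate of an extractor already kept
        else:
            seen.add(key)
            kept.append(record)
    return kept, removed
```

Note that this only cleans up after the fact; duplicates can still appear between two runs of the cleanup.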

robkooper commented 2 years ago

One solution discussed with @ddey2 is the notion of a primary clowder instance. If we go this route, maybe use this for elasticsearch as well.

robkooper commented 2 years ago

This can probably be tested by setting primary to true/false in custom.conf.
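To make the primary-instance idea concrete, here is a small sketch under stated assumptions: `is_primary` is a hypothetical flag mirroring the primary setting in custom.conf, and `store` is a plain dict standing in for the extractor list. Only the instance with the flag set registers new extractors, so with a single primary there is only one writer and nothing to race against.

```python
def handle_heartbeat(store, name, is_primary):
    """Register an extractor from a heartbeat, but only on the primary instance.

    Returns True if this call added the extractor, False otherwise.
    """
    if not is_primary:
        return False  # non-primary instances leave registration to the primary
    if name in store:
        return False  # already registered
    store[name] = {"name": name}
    return True
```

For example, calling `handle_heartbeat(store, "wordcount", False)` leaves `store` unchanged, while the same call with `True` adds the extractor once and returns `False` on any repeat.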

ddey2 commented 2 years ago

Added the same logical concept for elasticsearch