Sage-Bionetworks / sage-monorepo

Where OpenChallenges, Schematic, and other Sage open source apps are built
https://sage-bionetworks.github.io/sage-monorepo/
Apache License 2.0

[Story] Store Kaggle challenges into a DB #1246

Open tschaffter opened 1 year ago

tschaffter commented 1 year ago

What projects is this story for?

OpenChallenges

As a user, I want

As an OpenChallenges admin, I want to review the information of new Kaggle challenges so that they can later be shown to users.

Description

We have a prototype service that pulls Kaggle challenges from the Kaggle API and pushes them to a Kafka topic. The goal of this story is to create another service that consumes the Kafka topic and writes challenges to a database as they arrive.
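
A minimal sketch of the consuming side, to make the idea concrete. It is not the actual service: the Kafka client is replaced by a plain iterable of JSON-encoded messages, SQLite stands in for whichever database is chosen, and the table name and the `id`, `title` and `url` fields are hypothetical placeholders, not the real Kaggle schema.

```python
import json
import sqlite3
from typing import Iterable


def consume_challenges(messages: Iterable[bytes], conn: sqlite3.Connection) -> int:
    """Write each JSON-encoded challenge message to the database as it arrives."""
    # Hypothetical table layout; the real schema is still to be decided.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS kaggle_challenge ("
        "id TEXT PRIMARY KEY, title TEXT, url TEXT)"
    )
    written = 0
    for message in messages:
        challenge = json.loads(message)
        # INSERT OR REPLACE keeps one row per challenge id, so a challenge
        # that is published to the topic again overwrites its earlier row.
        conn.execute(
            "INSERT OR REPLACE INTO kaggle_challenge (id, title, url) VALUES (?, ?, ?)",
            (str(challenge["id"]), challenge.get("title"), challenge.get("url")),
        )
        conn.commit()
        written += 1
    return written


# Stand-in for the Kafka topic: two messages for the same (made-up) challenge.
conn = sqlite3.connect(":memory:")
messages = [
    b'{"id": 1, "title": "Demo challenge", "url": "https://example.org/c/1"}',
    b'{"id": 1, "title": "Demo challenge (updated)", "url": "https://example.org/c/1"}',
]
written = consume_challenges(messages, conn)
rows = conn.execute("SELECT id, title FROM kaggle_challenge").fetchall()
```

In a real service the `messages` iterable would be fed by a Kafka consumer, and committing once per message could be replaced by batching if throughput matters.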

Acceptance criteria

Tasks

Anything else?

No response

Have you linked this story to a GitHub Project?

tschaffter commented 1 year ago

One option may be to store the original Kaggle challenges in a document DB such as MongoDB. This would give us greater flexibility in how we use Kaggle data. For example, we may currently ignore a field from the original Kaggle challenge objects but find a use for it in the future.
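
To illustrate the flexibility argument, here is a rough sketch that keeps each challenge as an unmodified JSON document. SQLite with a JSON text column is used only as a stand-in for a document DB such as MongoDB, and the table name, helper names and the `reward` field are hypothetical.

```python
import json
import sqlite3


def store_raw(conn: sqlite3.Connection, challenge: dict) -> None:
    """Store the whole challenge object, including fields we do not use yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_kaggle_challenge ("
        "id TEXT PRIMARY KEY, doc TEXT NOT NULL)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO raw_kaggle_challenge (id, doc) VALUES (?, ?)",
        (str(challenge["id"]), json.dumps(challenge)),
    )
    conn.commit()


def load_raw(conn: sqlite3.Connection, challenge_id) -> dict:
    """Return the stored document exactly as it was received."""
    row = conn.execute(
        "SELECT doc FROM raw_kaggle_challenge WHERE id = ?", (str(challenge_id),)
    ).fetchone()
    if row is None:
        raise KeyError(challenge_id)
    return json.loads(row[0])


conn = sqlite3.connect(":memory:")
# "reward" plays the role of a field that is ignored today (made-up example data).
store_raw(conn, {"id": 42, "title": "Demo challenge", "reward": "$10,000"})

# Later, the ignored field is still available without re-fetching from Kaggle.
reward = load_raw(conn, 42)["reward"]
```

Because nothing is dropped at write time, a field that becomes useful later can be read from the stored documents directly.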

tschaffter commented 1 year ago

Moved to Backlog

tschaffter commented 12 months ago

Added to Sprint 23.10.

tschaffter commented 12 months ago

Elasticsearch can store unstructured documents, so we could store raw Kaggle challenges in ES and then process them before adding them to the OC DB.
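
The processing step between the raw store and the OC DB could look roughly like the sketch below. It only shows the mapping from a raw document to an OC-style record; the function name, the input fields (`id`, `title`, `url`) and the output fields (`slug`, `name`, `website_url`, `platform`) are all assumptions made for illustration, not the actual OpenChallenges schema.

```python
from typing import Optional


def to_oc_challenge(raw: dict) -> Optional[dict]:
    """Map a raw Kaggle document to a hypothetical OC record, or None if unusable."""
    # Skip documents that lack the minimum needed to build a record.
    if "id" not in raw or not raw.get("title"):
        return None
    return {
        "slug": f"kaggle-{raw['id']}",
        "name": raw["title"].strip(),
        "website_url": raw.get("url"),  # optional in this sketch
        "platform": "kaggle",
    }


# Made-up raw documents, as they might be read back from the raw store.
raw_docs = [
    {"id": 7, "title": "  Demo challenge ", "url": "https://example.org/c/7", "extra": 1},
    {"id": 8},  # incomplete document, skipped by the mapping
]
processed = [c for c in map(to_oc_challenge, raw_docs) if c is not None]
```

Keeping the mapping in a pure function means the raw documents can be reprocessed whenever the target schema changes.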

tschaffter commented 11 months ago

Added to Backlog