ebi-ait / hca-ebi-wrangler-central

This repo is for tracking work related to wrangling datasets for the HCA, associated tasks and for maintaining related documentation.
https://ebi-ait.github.io/hca-ebi-wrangler-central/
Apache License 2.0

LitSuggest data curation #629

Open ipediez opened 2 years ago

ipediez commented 2 years ago

Description

I've been testing the LitSuggest tool from NCBI as a discovery tool for new datasets to include in the HCA. Each week the tool selects published literature from PubMed, and that selection then needs to be manually curated.

Documents

To do

Wkt8 commented 2 years ago

The LitSuggest tool was fed eligible and non-eligible examples from Ingest in order to train a specific network. It then searches through PubMed and provides a weekly digest of what it thinks is eligible and non-eligible.

Irene is curating the items it thinks are eligible, checking whether each one actually is, and will then probably feed the results back into LitSuggest to further increase its accuracy.

Help with curating the LitSuggest weekly digest would be great.
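
For anyone new to this, the train, suggest, curate, retrain loop described above can be sketched roughly as below. This is only an illustration: LitSuggest's internals are not described in this thread, so a toy naive Bayes classifier stands in for it, and the titles, labels, and function names (`tokenize`, `train`, `score`) are all made up for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a title/abstract and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def train(labelled):
    """Count tokens per label from (text, is_eligible) pairs."""
    counts = {True: Counter(), False: Counter()}
    docs = Counter()
    for text, is_eligible in labelled:
        counts[is_eligible].update(tokenize(text))
        docs[is_eligible] += 1
    return counts, docs


def score(model, text):
    """Naive Bayes log-odds that the text is eligible (> 0 leans eligible)."""
    counts, docs = model
    vocab_size = len(set(counts[True]) | set(counts[False]))
    total_pos = sum(counts[True].values())
    total_neg = sum(counts[False].values())
    # Add-one smoothing on the class prior and on every token count.
    log_odds = math.log((docs[True] + 1) / (docs[False] + 1))
    for token in tokenize(text):
        p_pos = (counts[True][token] + 1) / (total_pos + vocab_size)
        p_neg = (counts[False][token] + 1) / (total_neg + vocab_size)
        log_odds += math.log(p_pos / p_neg)
    return log_odds


# Hypothetical seed examples standing in for what was exported from Ingest.
seed = [
    ("single-cell RNA sequencing atlas of human lung", True),
    ("single-nucleus transcriptomics of human kidney tissue", True),
    ("mouse behavioural study of sleep deprivation", False),
    ("review of hospital staffing policy", False),
]
model = train(seed)

# Hypothetical weekly digest: keep only what the model leans towards eligible.
digest = ["single-cell atlas of human liver", "hospital policy review"]
suggested = [title for title in digest if score(model, title) > 0]

# Manual curation step: a wrangler confirms or rejects each suggestion,
# and the verdicts are added to the training data for the next round.
verdicts = [("single-cell atlas of human liver", True)]
model = train(seed + verdicts)
```

The point is the feedback step at the end: every manual verdict, positive or negative, becomes extra training data, which is why curating the digest should improve later suggestions.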

Wkt8 commented 2 years ago

Irene is working through it at a rate of 5-10 datasets a day. @ipediez to prepare a short info meeting on this.

ESapenaVentura commented 2 years ago

@ipediez created a poll to schedule a discussion about NLP and how this LitSuggest curation works.

ofanobilbao commented 1 year ago

This is not so much Stalled as simply not being worked on by anyone. Either we move it to the Backlog or we close it, but keeping it in Stalled does not make sense to me. @gabsie any thoughts?