BonfaceKilz / feedanalyser

Fetch info from public feeds
GNU General Public License v3.0

[Feature] Add an NLP filter #12

Open BonfaceKilz opened 4 years ago

BonfaceKilz commented 4 years ago

Is your feature request related to a problem? Please describe.

The current goal is to have a feed that can be plugged into scientific websites and similar places. To that end, we want to keep it apolitical and restricted to a given topic. It would be very inappropriate if political (or other off-topic) tweets got jumbled up with the scientific tweets.

Describe the solution you'd like

When a tweet is fetched from Twitter, it is first passed through a model that filters out irrelevant tweets using some form of sentiment or text analysis. Tweets that pass the filter are then stored in a queue to be displayed later.
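To make the fetch, filter, queue flow concrete, here is a minimal sketch. It is not the project's code: the keyword sets `ON_TOPIC` and `OFF_TOPIC` and the functions `is_relevant` and `enqueue_relevant` are hypothetical names, and a plain keyword match stands in for whatever sentiment or text-analysis model ends up being used.

```python
import re
from collections import deque

# Hypothetical vocabularies; a real filter would use a trained model instead.
ON_TOPIC = {"genome", "genomics", "protein", "sequencing", "rna", "dna"}
OFF_TOPIC = {"election", "senate", "president", "vote"}


def is_relevant(text):
    """Return True if the text mentions an on-topic word and no off-topic word."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return bool(words & ON_TOPIC) and not (words & OFF_TOPIC)


def enqueue_relevant(tweets, queue):
    """Append only the relevant tweets to the display queue, preserving order."""
    for tweet in tweets:
        if is_relevant(tweet):
            queue.append(tweet)
    return queue


# Usage: fetched tweets go through the filter before reaching the queue.
display_queue = deque()
enqueue_relevant(
    [
        "New RNA sequencing method released",
        "Who will win the election?",
        "The president tweets about DNA",
    ],
    display_queue,
)
```

Only the first tweet ends up in `display_queue`: the second has no on-topic word, and the third is rejected because it also contains an off-topic word. Swapping `is_relevant` for a model-backed predicate would leave the queueing step unchanged.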

Describe alternatives you've considered

Manually voting on tweets. That's what we do now :)

User Stories (optional)

As a scientist, I want to view tweets relevant to my field so that I can stay informed on the given topic area.

Feature: Add NLP filter

Scenario: Please use Gherkin here

TODO

Additional context

The solution should be constrained to:

BonfaceKilz commented 4 years ago

Idea: Uprank by scientific vocabulary via latent semantic indexing of one year of PubMed
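A rough sketch of the upranking idea follows. Note that it does not implement latent semantic indexing: it swaps in a much simpler vocabulary-overlap score, and the two sample abstracts are invented stand-ins for a year of PubMed. The names `build_vocabulary`, `science_score`, and `uprank` are hypothetical.

```python
import re
from collections import Counter

# Small illustrative stopword list; not exhaustive.
STOPWORDS = {"the", "a", "of", "in", "and", "to", "is", "for", "on", "with"}


def tokens(text):
    """Lowercase, split into alphabetic words, and drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def build_vocabulary(abstracts, min_count=1):
    """Collect words that occur at least min_count times in the reference corpus."""
    counts = Counter(w for abstract in abstracts for w in tokens(abstract))
    return {w for w, c in counts.items() if c >= min_count}


def science_score(text, vocabulary):
    """Fraction of a tweet's non-stopword tokens found in the vocabulary."""
    words = tokens(text)
    if not words:
        return 0.0
    return sum(w in vocabulary for w in words) / len(words)


def uprank(tweets, vocabulary):
    """Sort tweets so those with more scientific vocabulary come first."""
    return sorted(tweets, key=lambda t: science_score(t, vocabulary), reverse=True)


# Usage with made-up reference text standing in for PubMed abstracts.
vocabulary = build_vocabulary(
    ["Gene expression in mouse liver tissue", "Protein folding and gene regulation"]
)
ranked = uprank(
    ["Lunch was great today", "New results on gene expression in mouse tissue"],
    vocabulary,
)
```

Here the gene-expression tweet is ranked first. A real LSI approach would instead project tweets and abstracts into a low-rank topic space and compare them there, which also catches related terms that never appear verbatim in the vocabulary.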

BonfaceKilz commented 4 years ago

Also, in the meantime, you could use Twitter's own advanced search to make queries. To restrict the feed to a list of users, twint can filter tweets from specific users via a flag. I'd need to refactor twint to cater for that, and also break this out into its own issue :smile:

BonfaceKilz commented 4 years ago

See: https://github.com/ncbi-nlp/BioSentVec