TimKoornstra / FinTwitBERT

FinTwitBERT: Specialized BERT Model for Financial Twitter Analysis. Trained on a large corpus of financial tweets, it is suited to sentiment analysis, trend prediction, and other financial NLP tasks.
MIT License

Increase number of datasets #6

Open StephanAkkerman opened 8 months ago

StephanAkkerman commented 8 months ago

Look on https://hf.co/datasets for more useful datasets

Unlabeled:

Twitter sentiment datasets (similar to tweet-eval):

Labeled:

StephanAkkerman commented 7 months ago

Not used:
https://sobigdata.d4science.org/catalogue-sobigdata?path=/dataset/crypto_related_tweets_from_10_10_2020_to_3_3_2021 -> very big (March alone is 23 GB)
https://zenodo.org/records/3895021 -> only contains Tweet IDs

StephanAkkerman commented 6 months ago

Combined pre-training dataset available on: https://huggingface.co/datasets/StephanAkkerman/crypto-stock-tweets
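
For anyone wanting to try the combined dataset, here is a minimal sketch of how it could be loaded with the Hugging Face `datasets` library. Only the dataset ID comes from the link above; the helper names `load_combined` and `iter_texts`, the `"text"` column name, and the split name used in the usage note are assumptions for illustration, so check the dataset card before relying on them.

```python
from typing import Iterable, Iterator, Mapping

# Dataset ID taken from the link in the comment above.
DATASET_ID = "StephanAkkerman/crypto-stock-tweets"


def load_combined(streaming: bool = True):
    """Fetch the dataset from the Hugging Face Hub (needs network access).

    Streaming avoids downloading the whole corpus up front.
    """
    # Imported lazily so the rest of this sketch works without `datasets`.
    from datasets import load_dataset

    return load_dataset(DATASET_ID, streaming=streaming)


def iter_texts(rows: Iterable[Mapping], text_column: str = "text") -> Iterator[str]:
    """Yield non-empty, whitespace-stripped tweet texts from row dicts.

    `text_column` is an ASSUMED column name, not confirmed by the thread.
    """
    for row in rows:
        text = (row.get(text_column) or "").strip()
        if text:
            yield text
```

Usage would look roughly like `for text in iter_texts(load_combined()["train"]): ...`, assuming the dataset exposes a `train` split.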