Open Team-thedatatribune opened 2 years ago
Hey, I am currently working on ML applications. I have some experience in data collection. Please assign this issue to me.
@dharmraj617, we require a diverse dataset of poetic content gathered from various platforms, including:
Your assistance in creating this dataset would be greatly appreciated, with the following key considerations in mind:
For further discussion and information, please join the dyPixa Discord server. We look forward to your valuable contributions! 🙌
heya, i'd like to work on writing python scripts for collecting data.
@Addy000, we currently have a program (here) that's been trained on go_emotions, capable of classifying any given (English) text into one of 28 different emotions.
Now, we're on an exciting new mission. We need a dataset to generate and recommend colors for each of these sentiments. It would be fantastic if you could contribute by providing:
For a detailed description, I recommend visiting issue #58.
You can find the complete list of all 28 emotions at https://huggingface.co/SamLowe/roberta-base-go_emotions. 🎨
I'll assign you the issue if you're interested.
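As a starting point for the color dataset described above, here is a minimal sketch of how contributed rows could be validated before being written to CSV. The two-column layout (`emotion`, `color`), the `#rrggbb` format, and the helper names `validate_row` and `rows_to_csv` are assumptions made for illustration, not a format the project has agreed on. The label set is the 28 go_emotions classes (27 emotions plus `neutral`).

```python
import csv
import io
import re

# The 28 go_emotions labels (27 emotions plus "neutral").
GO_EMOTIONS = frozenset({
    "admiration", "amusement", "anger", "annoyance", "approval", "caring",
    "confusion", "curiosity", "desire", "disappointment", "disapproval",
    "disgust", "embarrassment", "excitement", "fear", "gratitude", "grief",
    "joy", "love", "nervousness", "optimism", "pride", "realization",
    "relief", "remorse", "sadness", "surprise", "neutral",
})

# Assumed color format: six-digit hex such as "#ffd700".
HEX_COLOR = re.compile(r"#[0-9a-fA-F]{6}")


def validate_row(emotion, color):
    """Return a normalized (emotion, color) pair or raise ValueError."""
    emotion = emotion.strip().lower()
    color = color.strip().lower()
    if emotion not in GO_EMOTIONS:
        raise ValueError(f"unknown emotion: {emotion!r}")
    if not HEX_COLOR.fullmatch(color):
        raise ValueError(f"not a #rrggbb color: {color!r}")
    return emotion, color


def rows_to_csv(rows):
    """Serialize validated rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["emotion", "color"])
    for emotion, color in rows:
        writer.writerow(validate_row(emotion, color))
    return buffer.getvalue()
```

For example, `rows_to_csv([("joy", "#FFD700")])` yields a header line followed by `joy,#ffd700`. The color value here is only a placeholder, not a recommendation for that emotion.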
@ravi-prakash1907 i went through it, would like to work on it
Dataset Requirements 📦📋
TL;DR 🥱
This issue is a great starting point for beginners in the open-source community. Here you can:
Issue Description:
In the context of the dyPixa project, this task revolves around the crucial need to gather and comprehensively document datasets for training and testing the machine learning models. This issue addresses the following key aspects:
python (preferred) code for sourcing diverse datasets. This may include acquiring text data from social media, product reviews, and news articles, and images with associated sentiments from public image repositories.
Types of Data Needed:
For the NLP and color suggestion models to be highly usable and effective, the following types of data should be considered:
Text Data:
Image Data:
By addressing these components and collecting the appropriate types of data, this issue will lay the foundation for robust machine learning model development and further enhancements in the dyPixa project. Your contributions here will greatly advance the project's capabilities. 🚀🌈
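For the python sourcing code mentioned in the issue description, a collection script might begin with something like the sketch below. It covers only the storage side: normalizing collected text, dropping duplicates, and emitting JSON Lines. The record fields (`id`, `text`, `source`, `label`) and the function names are hypothetical. How the text is actually fetched from each platform is left out, since that depends on the platform's API and terms of use.

```python
import hashlib
import json


def make_record(text, source, label=None):
    """Build one dataset record, or return None for empty text."""
    # Collapse runs of whitespace so near-identical copies hash the same.
    normalized = " ".join(text.split())
    if not normalized:
        return None
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    return {
        "id": digest[:16],   # short content hash used for de-duplication
        "text": normalized,
        "source": source,    # e.g. "review", "news" (assumed categories)
        "label": label,      # optional sentiment label, if known
    }


def deduplicate(records):
    """Keep the first record for each id and skip None entries."""
    seen = set()
    unique = []
    for record in records:
        if record is None or record["id"] in seen:
            continue
        seen.add(record["id"])
        unique.append(record)
    return unique


def to_jsonl(records):
    """Serialize records as JSON Lines, one object per line."""
    return "".join(
        json.dumps(record, ensure_ascii=False) + "\n" for record in records
    )
```

JSON Lines is used here because it appends cleanly as new data arrives and keeps each sample independently parseable. A CSV or Parquet layout would work as well if the project prefers it.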