gimseng / 99-ML-Learning-Projects

A list of 99 machine learning projects for anyone interested to learn from coding and building projects
MIT License

[EXE] Exercise to use huggingface #41

Open gimseng opened 3 years ago

gimseng commented 3 years ago

Some exercises and solutions based on huggingface would be cool. This is an advanced project. For example, use the library to run BERT or GPT-2 on some fun NLP-related data.

sophieml commented 3 years ago

Hi,

I'd like the chance to work on this! I've been working with Transformer neural networks in my academic research work, and I'm interested in creating an exercise that uses the pretrained GPT-2 by huggingface to learn to generate poetry.

Learning Goals

Learn how to fine-tune pretrained models to apply to more specific NLP tasks. In particular, someone could use this exercise to compare the results from Transformer architectures with other deep learning techniques, like Andrej Karpathy's char-rnn.

Exercise Statement

Train a neural network that can generate poetry. Ideally, it should learn to imitate poetic style and structure (rhyme, meter, etc.).
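To make the exercise statement a bit more concrete, here is a rough sketch of what fine-tuning the pretrained GPT-2 might look like with the `transformers` library and PyTorch. It is only a starting point under stated assumptions: the function names (`chunk_token_ids`, `fine_tune_gpt2`, `generate_poem`) are made up for illustration, the hyperparameters are placeholders rather than tuned values, and a batch size of 1 is used purely to keep the loop short.

```python
from typing import List


def chunk_token_ids(token_ids: List[int], block_size: int) -> List[List[int]]:
    """Split one long token stream into equal-length training blocks.

    A trailing remainder shorter than block_size is dropped.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    usable = len(token_ids) - len(token_ids) % block_size
    return [token_ids[i:i + block_size] for i in range(0, usable, block_size)]


def fine_tune_gpt2(text: str, block_size: int = 128, epochs: int = 1, lr: float = 5e-5):
    """Fine-tune pretrained GPT-2 on raw text (placeholder hyperparameters)."""
    # Imported lazily so the helper above works without these packages.
    import torch
    from transformers import GPT2LMHeadModel, GPT2TokenizerFast

    tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")
    model.train()

    blocks = chunk_token_ids(tokenizer(text)["input_ids"], block_size)
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)

    for _ in range(epochs):
        for block in blocks:
            batch = torch.tensor([block])  # batch size 1 keeps the sketch short
            # Passing labels=input_ids makes the model return the next-token
            # cross-entropy loss; the one-position shift happens internally.
            loss = model(input_ids=batch, labels=batch).loss
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()
    return model, tokenizer


def generate_poem(model, tokenizer, prompt: str, max_new_tokens: int = 60) -> str:
    """Sample a continuation of `prompt` from the (fine-tuned) model."""
    import torch

    model.eval()
    input_ids = tokenizer(prompt, return_tensors="pt")["input_ids"]
    with torch.no_grad():
        output = model.generate(
            input_ids,
            max_new_tokens=max_new_tokens,
            do_sample=True,
            top_p=0.95,
            pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token
        )
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

Something like `model, tok = fine_tune_gpt2(poems_text)` followed by `generate_poem(model, tok, "Shall I compare thee")` would then give samples to judge for rhyme and meter. A real exercise would probably want batching, a GPU, and a held-out set as well.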

Prerequisites

PyTorch, transfer learning

Data source/summary:

I might end up scraping a new dataset myself, but this collection of poems from Project Gutenberg looks promising: https://github.com/aparrish/gutenberg-poetry-corpus
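For reference, a small sketch of how that corpus could be read into per-book groups of lines. This assumes the corpus is distributed as gzipped newline-delimited JSON in which each record has an `s` field (one line of poetry) and a `gid` field (the Project Gutenberg book ID); that layout should be checked against the corpus README before relying on it. The function names are hypothetical.

```python
import gzip
import json
from collections import defaultdict
from typing import Dict, Iterable, List


def group_lines_by_book(ndjson_lines: Iterable[str]) -> Dict[str, List[str]]:
    """Group poetry lines by book ID, keeping their original order.

    Assumes each input line is a JSON object with keys "s" and "gid".
    """
    books: Dict[str, List[str]] = defaultdict(list)
    for raw in ndjson_lines:
        raw = raw.strip()
        if not raw:
            continue  # tolerate blank lines
        record = json.loads(raw)
        books[str(record["gid"])].append(record["s"])
    return dict(books)


def load_corpus(path: str) -> Dict[str, List[str]]:
    """Read a gzipped NDJSON file from `path` and group its lines by book."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return group_lines_by_book(handle)
```

Joining each book's lines with `"\n"` would give one training text per book, which keeps line breaks visible to the model.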

gimseng commented 3 years ago

@sophieml Great! I think it's a great idea, thanks for contributing. Please check out the contributing guidelines and some of the previous projects to see how we organize and format the projects. Looking forward to your PR.

sridhatta commented 2 years ago

Hi @gimseng, I am interested in taking up this exercise. Please let me know if you can assign it to me.