Train a machine learning model to classify images into art styles: manga-color, manga-ink, hentai, other (more classes can be added in the future).
The data will come from our SQL database (the media attachments), and the labels will be assigned manually.
Create a data pipeline that downloads images from their Twitter URLs, converts them to tensors, scales them down, and crops them. Potentially augment the images with slight rotations, shifts, or zooms, and possibly brightness and contrast changes (start with an initial template, don't augment too much in the beginning).
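As a starting template for the scale, crop, and augment steps, here is a minimal sketch of the geometry only. It does not download or decode images. The 256/224 sizes, the augmentation limits, and the names resize_shorter_side, center_crop_box, MILD_AUGMENT, and sample_augmentation are all placeholder assumptions, not decided values.

```python
import random

# Hypothetical, deliberately mild augmentation limits (assumed values, tune later).
MILD_AUGMENT = {
    "rotate_deg": 5.0,    # rotate by up to +/- 5 degrees
    "shift_frac": 0.05,   # shift by up to +/- 5% of the image size
    "zoom_frac": 0.05,    # zoom in or out by up to 5%
    "brightness": 0.10,   # brightness change of up to +/- 10%
    "contrast": 0.10,     # contrast change of up to +/- 10%
}


def resize_shorter_side(width, height, target=256):
    """Return (new_width, new_height) with the shorter side scaled to `target`."""
    scale = target / min(width, height)
    # max() guards against rounding pushing a side just below the target.
    return max(target, round(width * scale)), max(target, round(height * scale))


def center_crop_box(width, height, size=224):
    """Return (left, top, right, bottom) of a centered size x size crop."""
    left = (width - size) // 2
    top = (height - size) // 2
    return left, top, left + size, top + size


def sample_augmentation(rng, limits=MILD_AUGMENT):
    """Draw one random value per augmentation, uniformly within its limit."""
    return {name: rng.uniform(-limit, limit) for name, limit in limits.items()}


new_w, new_h = resize_shorter_side(1200, 800)
crop = center_crop_box(new_w, new_h)
params = sample_augmentation(random.Random(0))
```

The crop box uses the (left, top, right, bottom) order that most image libraries expect. Augmentation parameters should only be sampled for training images, so validation and test images stay unchanged.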
Split the data into train, validation, and test sets before applying the data pipeline (start with 1000 train, 200 validation, 400 test for all classes - try to get even splits in terms of number of images - but read papers and experiment to see if this affects the accuracy of the model).
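The split could be sketched as below, assuming the counts are meant per class (the plan does not say whether they are per class or in total) and that each labeled row is an (image_id, label) pair. The function name split_per_class and the fixed seed are hypothetical.

```python
import random
from collections import defaultdict


def split_per_class(items, n_train=1000, n_val=200, n_test=400, seed=0):
    """Split (image_id, label) pairs into train/val/test with equal counts per class."""
    by_label = defaultdict(list)
    for image_id, label in items:
        by_label[label].append(image_id)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    needed = n_train + n_val + n_test
    splits = {"train": [], "val": [], "test": []}

    for label, ids in sorted(by_label.items()):
        if len(ids) < needed:
            raise ValueError(f"class {label!r} has {len(ids)} images, needs {needed}")
        ids = sorted(ids)  # sort first so the shuffle does not depend on input order
        rng.shuffle(ids)
        splits["train"] += [(i, label) for i in ids[:n_train]]
        splits["val"] += [(i, label) for i in ids[n_train:n_train + n_val]]
        splits["test"] += [(i, label) for i in ids[n_train + n_val:needed]]
    return splits
```

Splitting on image IDs before any processing keeps augmented copies of a training image from leaking into the validation or test sets.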
Train the model using transfer learning from a pretrained Hugging Face model, and be prepared to subclass or alter the model.
Consider an ensemble.
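One simple form of ensembling is soft voting: average the class probabilities from several models and pick the highest. This is a sketch of that one option, not a decision on how the ensemble should work. The function name soft_vote and the example probabilities are made up for illustration.

```python
LABELS = ["manga-color", "manga-ink", "hentai", "other"]


def soft_vote(prob_lists, labels=LABELS):
    """Average per-class probabilities across models and return (label, averages)."""
    n_models = len(prob_lists)
    averages = [
        sum(probs[i] for probs in prob_lists) / n_models
        for i in range(len(labels))
    ]
    best = max(range(len(labels)), key=averages.__getitem__)
    return labels[best], averages


# Made-up outputs from three hypothetical models for a single image.
predictions = [
    [0.6, 0.2, 0.1, 0.1],
    [0.3, 0.5, 0.1, 0.1],
    [0.5, 0.3, 0.1, 0.1],
]
label, averages = soft_vote(predictions)
```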
When the model reaches above 90% accuracy it should be fine - we will implement an active learning pipeline in the future and admin the website daily.
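A minimal sketch of checking the 90% target, assuming it refers to overall accuracy on the held-out test set. The per-class breakdown is an extra not asked for in the plan, added because a good overall number can hide one weak class. The function name accuracy_report is hypothetical.

```python
from collections import Counter


def accuracy_report(y_true, y_pred, threshold=0.90):
    """Return overall accuracy, per-class accuracy, and whether the target is met."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and the same length")

    pairs = list(zip(y_true, y_pred))
    overall = sum(t == p for t, p in pairs) / len(pairs)

    totals = Counter(y_true)                      # images per true class
    hits = Counter(t for t, p in pairs if t == p)  # correct predictions per class
    per_class = {label: hits[label] / totals[label] for label in totals}

    return {
        "accuracy": overall,
        "per_class": per_class,
        "meets_target": overall > threshold,  # "above 90%", so strictly greater
    }
```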