Open arthurwolf opened 8 months ago
Hello! Your approach looks good to me, and it sounds like your hard work is paying off. If I were working on this particular project, I would experiment with fine-tuning llava once you have a solid dataset, to see if it gives better results than OpenAI's models. I have yet to see anyone share a fine-tune of llava for a specific task, so I'm curious how well it would work. If you are posting your progress on the project anywhere, please share the link, as I am interested to see it in action once you have it all working.
Thanks for the feedback.
I'll soon have about two comic books' worth of data, which I think would be enough to start fine-tuning llava, but I have two issues there: 1. this is all very new, and there is no "easy guide" to fine-tuning most things, llava even less so; it's all very cryptic and assumes a high technical level, and 2. my assumption for llava is that even fine-tuning would require a lot more compute than I can afford.
I've tried an alternative to this: getting my data into the llava training dataset for the next llava version. I've opened GitHub issues and sent some emails, but so far no answer. I hope I can make it happen; I think it'd benefit not just me but the model itself.
About posting progress, I'm considering starting a YouTube channel with some updates. I'll post about it here if/when that happens.
Cheers, and thanks again.
Really cool project.
I'm working on something similar (structurally at least), a manga-to-anime pipeline. It involves a lot of different steps/models, similar to this project:
I'll be looking more closely into your project, in particular how it's organized. Thanks a lot for sharing. I'd be curious whether you have any insights on how you'd approach manga reading if you had to.
Cheers!
prompt.json
prompt.txt
reading.json
response.txt
result.json