Closed (yk closed 1 year ago)
Hi @yk, let's do it. From the discussion on #126 I'll be taking this task and proposing a step-by-step solution either tonight or tomorrow.
Hey, hi! Can I also collaborate?
Hi @rohanpatankar926 sure. I'm not sure what the best way to go is for that. A suggestion: we let @dhruv2601 come up with an initial implementation, and then iterate on that? @dhruv2601 might then also switch back to #126 while you can improve the detector.
Hi, if possible I would like to help with this and/or #126
Hi @MattiaSangermano thanks for the interest. See my suggestion above, would that work for you?
Yes, at this point I think it's the best way to proceed
@yk hi, I'd also hope to contribute. I read discussions above and think it makes a lot of sense.
Could zero-shot classification be a solution? "facebook/bart-large-mnli" on HF gives a >0.7 score for @yk's initial post being a request :)
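A minimal sketch of what that zero-shot check could look like with the Hugging Face `transformers` pipeline. The candidate labels and the 0.7 threshold here are assumptions for illustration (the comment above only mentions a >0.7 score for "request"), and the helper names are hypothetical:

```python
from typing import Dict, List

# Hypothetical label set; only "request" is mentioned in the thread.
CANDIDATE_LABELS = ["request", "statement"]


def request_score(result: Dict[str, List]) -> float:
    """Return the score assigned to the "request" label.

    `result` is the dict returned by a zero-shot-classification pipeline,
    which carries parallel "labels" and "scores" lists.
    """
    return dict(zip(result["labels"], result["scores"]))["request"]


def is_request(result: Dict[str, List], threshold: float = 0.7) -> bool:
    """Treat the text as a request if its score exceeds the (assumed) threshold."""
    return request_score(result) > threshold


def classify(text: str) -> Dict[str, List]:
    """Run facebook/bart-large-mnli as a zero-shot classifier on one text."""
    # Imported lazily so the helpers above can be used without transformers.
    from transformers import pipeline

    classifier = pipeline(
        "zero-shot-classification", model="facebook/bart-large-mnli"
    )
    return classifier(text, candidate_labels=CANDIDATE_LABELS)
```

Calling `is_request(classify(post_text))` would then give a yes/no decision per post. In practice the pipeline should be built once and reused rather than recreated on every call.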
yes it's probably viable to build an ensemble of things like this. depends on how far one can get the noise down
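One possible shape for such an ensemble, as a rough sketch only: each detector maps a text to a score in [0, 1], and the scores are combined by a weighted average. The detector functions and weights below are hypothetical placeholders, not anything used in the project:

```python
from typing import Callable, Sequence, Tuple

# A detector takes a text and returns a score in [0, 1].
Detector = Callable[[str], float]


def ensemble_score(text: str, detectors: Sequence[Tuple[Detector, float]]) -> float:
    """Weighted average of the detector scores for one text."""
    total_weight = sum(weight for _, weight in detectors)
    if total_weight <= 0:
        raise ValueError("detector weights must sum to a positive number")
    return sum(detector(text) * weight for detector, weight in detectors) / total_weight


def ends_with_question_mark(text: str) -> float:
    """Toy detector: 1.0 if the text ends with '?', else 0.0."""
    return 1.0 if text.strip().endswith("?") else 0.0


def mentions_please(text: str) -> float:
    """Toy detector: 1.0 if the text contains the word 'please', else 0.0."""
    return 1.0 if "please" in text.lower().split() else 0.0
```

A model-based score (for example a zero-shot classifier) could be plugged in as another `Detector`. How much noise remains would depend on the weights and the cut-off chosen, which would need tuning on labeled data.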
Hi @dhruv2601, I have written scripts based on #126 to process tweets into conversation threads. If any model has been trained to detect useful instructions, we could then run it on that file to filter it. If you need the file, I can send it to you via Discord. I will also update my fork of the repo soon with the code to do all the processing, in case anyone wants to download dumps and try it on their side.
@dhruv2601 any updates on this?
Hey all and @yk, I've trained a model for this task and it works well. Currently, I am testing the model on data beyond the validation set, i.e. on as many instruction styles as possible, and I'm using GPT-JT and ChatGPT to help with this. It is an iterative process: when I discover new instruction styles, I add them to the training data and repeat.
The action item currently is to prepare a final model, upload it to HF and create a model card and data collection process. Hopefully, I'll update again in a couple of days.
@dhruv2601 thanks a lot for the update. is it possible that you check in the code for this somewhere in the repo under e.g. /model/instruction_detector/
and come to our discord (ping me there) to give a bit of regular updates on it? the issue is, we need to know very accurately what's in the model, both in terms of code and data, in order to use it.
@dhruv2601 Did you use just the twitter data for the model you've trained or used additional datasets?
Hi, due to work and school deadlines, I have been a bit delayed in updating this task. Plan to be more active in a couple of days.
@dhruv2601 would love to test your model. ping in data channel in discord.
@dhruv2601 checking on this again.
Hello @dhruv2601 , is there any update on the instruction detector model?
@yk, it seems like this task would benefit from being sped up. Though @dhruv2601 is already on this, may I spend some time on a minimal viable example?
This issue has stalled and I'm not sure how relevant it still is, so I'm removing it from the project board for now.
There is lots of conversational data on the web (for example Twitter, Reddit, etc.), yet only a tiny fraction of it starts with some sort of instruction or request for a task to be fulfilled. We need a system, either a model or a heuristic (or a combination), to classify text as "instruction-like", which would allow us to harvest data from a wide variety of places.
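To make the heuristic option concrete, here is a minimal sketch of a rule-based check. The cue lists are illustrative assumptions, not something agreed in the project, and a rule set this small will produce false positives (e.g. exclamations starting with "What"):

```python
import re

# Illustrative cue lists; both are assumptions and would need tuning on real data.
IMPERATIVE_VERBS = {
    "write", "explain", "summarize", "translate", "list", "describe",
    "give", "tell", "show", "create", "generate",
}
REQUEST_PATTERN = re.compile(
    r"^(please\b|can you\b|could you\b|would you\b|how (do|can|to)\b|what\b|why\b)",
    re.IGNORECASE,
)


def looks_like_instruction(text: str) -> bool:
    """Return True if the text opens with a request phrase or an imperative verb."""
    stripped = text.strip()
    if not stripped:
        return False
    if REQUEST_PATTERN.match(stripped):
        return True
    # Take the first word and compare it against the imperative verb list.
    first_word = re.split(r"\W+", stripped, maxsplit=1)[0].lower()
    return first_word in IMPERATIVE_VERBS
```

A check like this only looks at how a message opens, so it would be cheap enough to run as a first-pass filter over large dumps before applying a trained model.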