Computer interaction using audio and speech recognition
Big-picture thought: would it make sense to divide Parrot.py into (i) a simple 'ordinary-user'-facing client and (ii) a framework / lib for advanced training / experimentation with Parrot models? #24
I've been wondering: would it make sense in the long term to divide Parrot.py into
(i) an 'ordinary-user'-facing client that makes it easy to do things like collecting user sounds and doing simple finetuning, and that can be used to finetune an off-the-shelf Parrot model without any machine learning knowledge
and (ii) a framework / library that power users can use for more advanced training and experimentation with Parrot models, and that can be easily set up on a remote GPU (or even just a Google Colab)?
This seems desirable to me: insofar as I'm experimenting with models, I'll want to do that on a GPU, with a codebase that's reasonably lean (one that doesn't also carry code for things like recording data samples) but that does come with the conveniences discussed in other issues, like config and experiment tracking (#23) or a learning rate finder. Conversely, it probably wouldn't be ideal to expose the complexities of training or finetuning models to an ordinary user who just wants a reasonably good model without having to learn about ML or DL.
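To make the proposed split concrete, here is a minimal sketch of how the two layers could relate: a lean library function exposing training knobs, and a thin client wrapper that hides them behind defaults. All names here (`finetune`, `finetune_for_user`, the dict-based model) are purely illustrative assumptions, not part of Parrot.py's current code.

```python
# Hypothetical sketch of the client/library split. Nothing here is
# real Parrot.py API; it only illustrates the layering idea.

# --- (ii) framework/library layer: lean, intended for GPU/Colab use ---
def finetune(base_model: dict, samples: list[tuple[list[float], str]],
             learning_rate: float = 1e-3) -> dict:
    """Power-user entry point: training knobs are exposed explicitly."""
    # A real implementation would run gradient steps on a GPU; this stub
    # just records which labels the finetuned model would cover.
    model = dict(base_model)
    model["labels"] = sorted({label for _, label in samples})
    model["learning_rate"] = learning_rate
    return model

# --- (i) client layer: wraps the library with sensible defaults ---
def finetune_for_user(samples: list[tuple[list[float], str]]) -> dict:
    """Ordinary-user entry point: no hyperparameters, no ML knowledge needed."""
    base = {"name": "off-the-shelf-parrot"}  # placeholder for a bundled model
    return finetune(base, samples)
```

The client layer depends on the library, never the other way around, which is what would let the library be installed on its own on a remote GPU without dragging in recording or UI code.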