Open gimseng opened 4 years ago
Personally, I envision the following groups of people and how each group could make the most of the repo to learn:
Group A: brand new to machine learning, maybe a few weeks into learning Python, and new to NumPy, pandas, etc. I imagine they will mostly read and run through the Python code. If they get stuck, do not understand some code, or find code that raises an error or does not run, they can raise the problem or question in the issue tracker.
Our role is to make sure that they can easily get the code running, a task which I personally think Colab makes easier. Also, having people use issue tickets, and maybe GitHub a bit more generally, could help beginners get oriented to git and version control.
Group B: intermediate machine learning people. I hope they will be the ones critiquing our solutions, proposing better code, and contributing new exercises and solutions. Furthermore, this serves as a playground for using git. If there are enough such people, the project's coverage could grow to include more advanced exercises, such as CV, NLP or reinforcement learning. These would be valuable even to me personally, as I struggle to find good resources that are curated, maintained and regularly updated in one place. To take this further, if some more advanced people want to provide detailed exercises that reproduce state-of-the-art arXiv papers, that would take this repo beyond a 'textbook on GitHub' model.
Group C: gurus in everything we are doing here. I would love for them to help maintain the repo and help us run the project better. They could implement a better project format and better coding practices, provide 'master class' level tutorial exercises, teach us scalable big data tech, etc. The list probably goes on forever.
Others: I am pretty sure I left out a big group of people. If so, please comment and let us know, either below or through an issue ticket. Please let us know how best to help you learn.
Perhaps I should polish the above and make it into the documentation on ‘how to use this project’.
As another use for this project, would the following model/vision be useful:
There are many, many coding examples online (see #19), and many of them are excellent. So far, though, I haven't found a collection that is well maintained, curated (with comments and perhaps discussion of the pros and cons of each source) and frequently updated (with other GitHub people reproducing arXiv papers).
For my own learning, I always keep a folder with various topics or projects; sometimes I write my own code and sometimes I just copy and run others' code. For example, I might have the Kaggle micro-course code example on the Titanic data, but I have also found a few good blog-type write-ups or Kagglers' code that did something better or in more detail. On my personal computer, I keep a few subfolders of these things. Could we, with some organized approach, keep curated code solutions (say, to the Titanic problem) with good overview documentation and discussion of the various solutions? Another example: some blogs/textbooks explain CNNs better than others. So in implementing an MNIST solution, we could have (1) some textbook code, (2) some PyTorch code (from the PyTorch documentation), (3) some good blog post implementations, etc.
@gimseng Instead of just going through the code, learners should first try implementing the exercise on their own; then, if they get stuck somewhere, they can refer to the study material for that project and the solution. We can write a guide for learners just like the one for contributors.
@AjayKhalsa That's a good idea. We should have one set of contributor's guidelines and one set of learner's guidelines.
I think we have to standardize or expand on the readme.md for the exercise of each project. So far, some are terse, while some have a well-thought-out notebook with instructions and fill-in-the-blank sections. Maybe we should figure out how we should structure the notebook in exercise. I am open to suggestions or a first draft of a format.
I realize that someone who's not in the loop who stumbles across this repo might not know what to do with it. Are they supposed to:
(a) fork it and just read through the Python code, or
(b) actively contribute exercises, or
(c) be maintainers(?), or
(d) do nothing?