dair-ai / dair-ai.github.io

Home of DAIR.AI
https://dair.ai/
MIT License
208 stars, 23 forks

Call for Contributing Data Science Tutorials #14

Closed omarsar closed 1 year ago

omarsar commented 4 years ago

If you are interested in helping democratize AI research, education, and technologies by publishing tutorials, please propose your ideas below. We will work closely with you before publishing any data science tutorial to improve the standardization, approachability, accessibility, quality, accuracy, and transparency of your work.

To simplify things, we can first work together to publish in this Medium publication, and then everything will be reposted to dair.ai to improve accessibility. If you prefer to directly post on dair.ai, you can submit a PR with your tutorial, following the format used in these posts.

Here is a tutorial example that can give you some ideas of the components needed to publish a great and successful tutorial. And here is a guide that I wrote to help data scientists communicate about their projects.

If you have any ideas on what you want to work on, write them below and we can discuss the process further. Ideally, I would prefer tutorials around a widely used and current tool or release/feature, but you can put your suggestions below. Once we have decided to move forward with a project, we can open a separate issue for it and track progress via this master issue.

Below I will also post some tutorial ideas that I have in mind. If you are interested in any of them, you are welcome to work on those too. Note that notebooks are the preferred format for tutorials, accompanied by a short summary in the form of a blog post.

Feel free to reach out at ellfae@gmail.com or DM on Twitter for more information on how to contribute. 🙏

Ideas:

Here are some examples of tutorials that have been successful in our publication:

abdulrahimq commented 4 years ago

I like this roadmap as a base to work from. I saw that a tutorial on tokenization, lemmatization, and stemming was already done, but I think I can add more detail to it and also add examples of what you can do for a language like Arabic, where this isn’t a trivial problem. I can write about different tokenization techniques for a morphologically rich language like Arabic, and I can show each stemming algorithm in more detail, for example. I would be able to write around 300 words a day or something along those lines.

I’m thinking of doing basic procedures -> NLP basic hypothesis -> string distance -> graph -> document, and in the end word embeddings and sequence labeling. Do you think this is a good plan?

omarsar commented 4 years ago

Hi @abdulrahimq! These are all great ideas and, of course, it would be nice to have content around them, especially for the Arabic language. There is always a lot to learn there. I propose that we open an issue for this task, and we can have a deeper discussion about it on Slack or on the issue. In the end, when we have all this material, we can propose a learning roadmap that consists of notebooks and other beginner materials. That's the ultimate goal and you are spot on. Open the issue here and assign it to yourself and me.

https://github.com/dair-ai/nlp_fundamentals/issues

omarsar commented 2 years ago

Move to Wiki