ethen8181 / machine-learning

:earth_americas: machine learning tutorials (mainly in Python3)
MIT License
3.17k stars 650 forks source link

Convert machine-learning to an Airbnb Knowledge Repo #4

Closed chang closed 6 years ago

chang commented 6 years ago

This script will automatically convert IPython notebooks to the Airbnb Knowledge Repo format, injecting the required header and adding them to the knowledge repo. It currently works only with the .ipynb files in the repository.

A bit more needs to be done to get it into working order:

@ethen8181

ethen8181 commented 6 years ago

Cool! I'll look into this more over the weekend. But here are some initial thoughts based on glancing through the code:

  1. I don't foresee adding any more .Rmd docs in the future, can we remove the inheritance of the KnowledgeRepo class?
  2. Why use @property as oppose to normal class attribute.
  3. I am fine with completely removing the tldr section, thus we can get rid of the random_words part and for both the tldr and the author.
  4. Ideally, I would like to follow sklearn's initialization, tldr, there should be no code logic even input validation in the __init__ method, all it does is assign the input argument to a class attribute.
  5. We could probably merge the convert and write method into 1 single method, and let it accept the path as an input argument. I think it makes sense when following sklearn's pattern, where we call .fit on the input data, thus here we call .convert on the input path. This would also avoid creating a new converter class for every single path since we could just pass the different path to the convert method.

Love to hear your thoughts ~

chang commented 6 years ago

Good points!

  1. Converting .Rmd to knowledge posts is different from .ipynb in that they need to be rendered through R. Since there are hardcoded filepaths in many of the notebooks, there was no easy way for me to do that. If they run in your environment as-is, however, we could easily convert them through the KR CLI.
  2. Just style. They're computed attributes so implementing them with @property is preferred over get_github_link(), get_start_date(), etc...
  3. Done, with the caveat that there needs to be some non-whitespace character in those fields for the KP to render. I think if we're going to deploy this for a wider audience, it would be worth spending an hour or two writing tl;drs.
  4. Done
  5. I'd keep it the way it is, since write() has the side effect of I/O
ethen8181 commented 6 years ago

@ericchang00 Discussion is moved to https://github.com/ethen8181/machine-learning/pull/5