dmlc / gluon-nlp

NLP made easy
https://nlp.gluon.ai/
Apache License 2.0
2.55k stars, 538 forks

[Website] Embrace d2lbook to generate the website notebooks #1366

Closed by sxjscience 4 years ago

sxjscience commented 4 years ago

Description

I think we can embrace d2lbook (https://github.com/d2l-ai/d2l-book) to generate the website and the Jupyter notebooks on our website. I tried it with the notebooks in https://github.com/sxjscience/AMLC2020-GluonNLP and found it easy to use. Basically, we just need to store the notebook contents in .md format and make sure all outputs are cleared.

Using the recent tutorial notebooks as an example:

git clone https://github.com/sxjscience/AMLC2020-GluonNLP.git
cd AMLC2020-GluonNLP
jupyter nbconvert --ClearOutputPreprocessor.enabled=True --to markdown */*.ipynb
sed -i 's/```python/```{.python .input}/g'  */*.md
cd 02_text_prediction
d2lbook build html
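
For anyone who wants to see what the sed step does to the converted files, here is a minimal Python sketch of the same fence rewrite. The function name `to_d2lbook_fences` and the sample text are made up for illustration and are not part of d2lbook. Note one deliberate difference: the sed command replaces the fence text wherever it appears, while this sketch only rewrites a fence that sits alone on its own line.

```python
import re

FENCE = "`" * 3  # three backticks, built here so the example nests cleanly

# A Python cell opens with the fence followed by "python", alone on its line.
_PY_FENCE = re.compile(r"^" + FENCE + r"python[ \t]*$", flags=re.MULTILINE)


def to_d2lbook_fences(markdown: str) -> str:
    """Rewrite plain Python fences as {.python .input} fences (hypothetical helper)."""
    return _PY_FENCE.sub(FENCE + "{.python .input}", markdown)


# Hypothetical sample: one Python cell as nbconvert would emit it.
sample = "Intro\n" + FENCE + "python\nprint(1)\n" + FENCE + "\n"
converted = to_d2lbook_fences(sample)
print(converted)
```

Closing fences and fences for other languages are left untouched, so the function can be run over a file more than once without changing the result.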

One problem is that additional files such as squad_utils.py cannot be included with the notebook.

What do you think about adopting d2lbook? @leezu @barry-jin @szha

sxjscience commented 4 years ago

Update: this requires the master version of d2lbook: https://github.com/d2l-ai/d2l-book

barry-jin commented 4 years ago

I have looked into this. It's true that d2lbook is easy to use. Its build pipeline covers compiling the notebooks, building the HTML, and deploying to S3, all with only a few commands. The configuration is also easier to maintain in config.ini than in conf.py. However, I have the following two concerns about using d2lbook in gluon-nlp.

  1. d2lbook is not stable enough to use. I tried both d2lbook-0.1.16 and d2lbook-0.1.17 to build the website for the gluon-nlp repository, and both failed because of our current documentation structure. That means we would need to either refactor our documentation file structure to fit d2lbook or modify the d2lbook code to fit our structure. Either option takes effort.

  2. I have already built a CI workflow in #1327 that builds the website and deploys it for developer preview, and it fits our current documentation well.

Maybe we can keep using the current workflow to build our website.