fastai / nbdev

Create delightful software with Jupyter Notebooks
https://nbdev.fast.ai/
Apache License 2.0

Safe migration from nbdev 1 for a slightly modified setup #998

Closed Rahuketu86 closed 2 years ago

Rahuketu86 commented 2 years ago

I am looking for guidance on migrating a repository that was created with nbdev 1 and customized with additional workflow steps. I followed the instructions in the nbdev 1 migration tutorial but ran into some issues that I need help with.

My setup consists of a remote Ubuntu server. I usually work in JupyterLab, which is exposed on host IP 0.0.0.0 and a prespecified port. In addition to the regular nbdev-generated docs, I also have a jupyterbook folder where I maintain recipe-style documentation, which can include material that is not part of the regular library setup.

Issues:

  1. The _docs folder is not created automatically by nbdev_preview. For the preview to work, I had to create the folder manually and then run nbdev_preview. [This explicit step is missing from the tutorial.]
  2. The sidebar is not created in the newly generated docs. What is needed to create it explicitly? My generated site looks like this:

image

The old site looked like this:

image

  3. I am unable to explicitly set the host to 0.0.0.0 in nbdev_preview. It would be nice to have that option available in the terminal command.
  4. After checking the nbdev_template repository, my guess is that (3) can be solved by introducing a custom _quarto.yml. How do I do this safely without conflicting with settings.ini? Which one takes precedence in the event of a conflict?

My current settings.ini is as follows :-

[DEFAULT]
# All sections below are required unless otherwise specified
host = github
lib_name = aiking
# For Enterprise Git add variable repo_name and company name
# repo_name = analytics
# company_name = nike

user = rahuketu86
description = A library for checking data quality issues
keywords = Data Quality, Anomaly Detection, outlier, Fastai 
author = Rahul Saraf
author_email = rahuketu86@gmail.com
copyright = Rahul Saraf
branch = master
version = 0.0.1
min_python = 3.6
audience = Developers
language = English
# Set to True if you want to create a more fancy sidebar.json than the default
custom_sidebar = True
# Add licenses and see current list in `setup.py`
license = apache2
# From 1-7: Planning Pre-Alpha Alpha Beta Production Mature Inactive
status = 2

# Optional. Same format as setuptools requirements
# requirements = 
requirements = fastcore fastai>=2.3 seaborn plotnine altair plotly dash jupyter-dash xlrd>=1.2.0 openpyxl pyarrow sqlalchemy hvplot datashader dask[complete] sklearn pyod torch torchvision python-dotenv kaggle  fastdot opencv-python scikit-image imutils moviepy lifelines beautifulsoup4 rich[jupyter] nltk emoji trax 
# azure-cognitiveservices-search-imagesearch
# selenium

dev_requirements = GitPython jupyter-book sphinx-click sphinx_inline_tabs sphinxext-rediraffe~=0.2.3 wandb nbdev<2.0.0 jupyterlab-myst
# Optional. Same format as setuptools console_scripts
# console_scripts = 
# Optional. Same format as setuptools dependency-links
# dep_links = 

###
# You probably won't need to change anything under here,
#   unless you have some special requirements
###

# Change to, e.g. "nbs", to put your notebooks in nbs dir instead of repo root
nbs_path = nbs
doc_path = docs

# Whether to look for library notebooks recursively in the `nbs_path` dir
recursive = False

# Anything shown as '%(...)s' is substituted with that setting automatically
# doc_host =  https://%(user)s.github.io
#For Enterprise Git pages use:  
#doc_host = https://pages.github.%(company_name)s.com.  
url = https://aiking.zealmaker.com
doc_host = https://aiking.zealmaker.com

#doc_baseurl = /%(lib_name)s/
doc_baseurl = /
# For Enterprise Github pages docs use:
# doc_baseurl = /%(repo_name)s/%(lib_name)s/

git_url = https://github.com/%(user)s/%(lib_name)s/tree/%(branch)s/
# For Enterprise Github use:
#git_url = https://github.%(company_name)s.com/%(repo_name)s/%(lib_name)s/tree/%(branch)s/

lib_path = %(lib_name)s
title = %(lib_name)s

#Optional advanced parameters
#Monospace docstings: adds <pre> tags around the doc strings, preserving newlines/indentation.
#monospace_docstrings = False
#Test flags: introduce here the test flags you want to use separated by |
#tst_flags = 
tst_flags = slow|cpp|cuda|ignore
#Custom sidebar: customize sidebar.json yourself for advanced sidebars (False/True)
#custom_sidebar = 
#Cell spacing: if you want cell blocks in code separated by more than one new line
#cell_spacing = 
#Custom jekyll styles: if you want more jekyll styles than tip/important/warning, set them here
jekyll_styles = note,warning,tip,important

and my Makefile is as follows:

.ONESHELL:
SHELL := /bin/bash
SRC = $(wildcard nbs/*.ipynb)

all: aiking docs mybook

soft: aiking docs

aiking: $(SRC)
    nbdev_build_lib
    touch aiking

sync:
    nbdev_update_lib

docs_serve: docs
    cd docs && bundle exec jekyll serve --host 0.0.0.0

docs: $(SRC)
    nbdev_build_docs
    touch docs

test:
    nbdev_test_nbs

release: pypi conda_release
    nbdev_bump_version

conda_release:
    fastrelease_conda_package

pypi: dist
    twine upload --repository pypi dist/*

dist: clean
    python setup.py sdist bdist_wheel

clean:
    rm -rf dist

mybook:
    echo "Building Book"
    jupyter-book build book/

jl:
    echo "Running Jupyter"
    jupyter lab --ip 0.0.0.0 --port 9000 --no-browser &

  5. How do I safely introduce steps in the GitHub workflow to deploy the website to Netlify? [I can still follow the old way of reading the repo folder.] Having played with the Quarto publishing framework, though, I think it would be easier to use that. The current workflow with the composite action seems comprehensive enough, but any guidance is appreciated. My present action looks like the following:
    
name: CI
on: [push, pull_request]

concurrency:
  group: main
  cancel-in-progress: true

jobs:
  build:
    runs-on: ubuntu-latest
    steps:

Rahuketu86 commented 2 years ago

Solved (3). It turns out that if I set

custom_sidebar = False

and then run nbdev_sidebar --force

It generates a _quarto.yml and sidebar.yml with the right settings in nbs directory as follows

image

Perhaps adding this to the documentation would be helpful to others.

(4) still doesn't work:

image

I am also still looking for some guidance on GitHub Actions (5).

Rahuketu86 commented 2 years ago

Resolved (4). Steps:

  1. Set custom_sidebar = False in settings.ini
  2. Generate sidebar.yml and _quarto.yml using nbdev_sidebar --force
  3. Update settings.ini to set custom_quarto_yml = True
  4. Update _quarto.yml to include host: "0.0.0.0"

image

I still think nbdev_preview should have a command-line option for the host, and the default _quarto.yml should be generated with a host field, to avoid roundabout steps like the ones above.
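
Steps 1 and 3 above can be scripted. The following is only a minimal sketch, not part of nbdev: the helper name enable_custom_quarto is hypothetical, and it assumes both keys live in the [DEFAULT] section as in the settings.ini shown earlier. Interpolation is disabled so that values such as %(user)s are written back unchanged. Note that configparser drops comments when it rewrites the file, so editing by hand may be preferable if the comments matter.

```python
import configparser
import io


def enable_custom_quarto(settings_text: str) -> str:
    """Return settings.ini text with the two flags from steps 1 and 3 set.

    Hypothetical helper for illustration only. Comments in the input
    are lost, because configparser does not preserve them.
    """
    # interpolation=None keeps '%(user)s'-style values as raw text.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(settings_text)

    defaults = parser["DEFAULT"]
    defaults["custom_sidebar"] = "False"     # step 1
    defaults["custom_quarto_yml"] = "True"   # step 3

    out = io.StringIO()
    parser.write(out)
    return out.getvalue()
```

Steps 2 and 4 (running nbdev_sidebar --force and adding the host entry to _quarto.yml) are still done by hand, as described above.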

Still looking for recommendations on (5): GitHub Actions and Netlify deployment.

Rahuketu86 commented 2 years ago

(6) How do I forcibly clean / strip out notebooks? I keep getting the following error:

image

I have already

  1. Installed the hooks on the repo with nbdev_install_hooks.
  2. Modified the notebooks, then built the docs using nbdev_preview followed by nbdev_prepare.
  3. Run nbdev_clean on the repo before committing.

I have gone through the cycle a few times without much success

image

My test.yaml

name: CI
on:  [workflow_dispatch, pull_request, push]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
    - name: Install Dependencies
      run: |
        sudo apt install -y graphviz
    - name: Fastai CI
      uses: fastai/workflows/nbdev-ci@master

deven367 commented 2 years ago

To get rid of the warnings for clean_ids, set clean_ids = True in settings.ini. I got this answer from Hamel in #990. Also, a good place to check for these migration issues is the forums.

I also faced the same issue with the sidebar. I asked the question on the forums and was able to resolve it.

Rahuketu86 commented 2 years ago

@deven367 Setting clean_ids = True removed the warnings. Additionally, I have to manually run nbdev_clean -> nbdev_prepare -> commit every time (after stopping nbdev_preview) to make sure CI works.

However, I don't understand why this is required. I would have expected nbdev_install_hooks to set these things up as pre-commit hooks.

If I don't do it I get CI failure with the following message

image

Reading the code for the composite action at https://github.com/fastai/workflows/blob/master/nbdev-ci/action.yml, I understand the warning, but I am not sure what the optimal workaround is for my setup.
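
One way to catch this before pushing is to reproduce the check locally. The sketch below rests on an assumption I have not verified against the action: that the CI check amounts to running nbdev_clean and nbdev_export and failing if git then reports changes. The function names are hypothetical. It should be run on a fully committed working tree, since anything already uncommitted is reported too.

```python
import subprocess
from typing import List, Sequence


def changed_files(porcelain: str) -> List[str]:
    """Extract paths from `git status --porcelain` output.

    Each line is two status characters, a space, then the path.
    (Renames appear as 'old -> new' and are returned as-is.)
    """
    return [line[3:] for line in porcelain.splitlines() if line.strip()]


def files_changed_by(
    commands: Sequence[Sequence[str]] = (("nbdev_clean",), ("nbdev_export",)),
) -> List[str]:
    """Run each command, then list what git sees as modified or untracked.

    The default commands are an assumption about what CI runs.
    """
    for command in commands:
        subprocess.run(list(command), check=True)
    status = subprocess.run(
        ["git", "status", "--porcelain"],
        check=True, capture_output=True, text=True,
    )
    return changed_files(status.stdout)
```

If files_changed_by() returns a non-empty list locally, committing those files before pushing would presumably avoid the CI failure.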

Rahuketu86 commented 2 years ago

Additionally, for (1): if the notebooks are in an nbs folder instead of the root folder, the _docs folder is generated inside the nbs folder instead of the root. I imagine this might break the GitHub deploy action.

My settings are the same as described above, with the following fields:


# Change to, e.g. "nbs", to put your notebooks in nbs dir instead of repo root
nbs_path = nbs
doc_path = docs

The nbs folder looks as follows:

image

(Contrary to what the migration tutorial instructions intend, nbs/_docs is not ignored by git.)
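
Whether nbs/_docs is ignored depends on how the pattern is written in .gitignore: a pattern such as _docs/ (no leading slash) matches a directory of that name at any depth, while /_docs/ matches only at the repository root. The following is a small sketch for checking which form a .gitignore uses. The function name is hypothetical, and it only recognizes these literal spellings, not wildcards or negations.

```python
def docs_ignore_scope(gitignore_text: str, name: str = "_docs") -> str:
    """Classify how `name` is ignored: 'any-depth', 'root-only' or 'missing'.

    Only the literal patterns name, name/, /name and /name/ are
    recognized; this is a simplification of the real gitignore rules.
    """
    patterns = [
        line.strip()
        for line in gitignore_text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    ]
    if any(p in (name, name + "/") for p in patterns):
        return "any-depth"   # also covers nbs/_docs
    if any(p in ("/" + name, "/" + name + "/") for p in patterns):
        return "root-only"   # nbs/_docs would not be ignored
    return "missing"
```

A result of "root-only" or "missing" would be consistent with nbs/_docs showing up in commits.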

Rahuketu86 commented 2 years ago

Update on (1)

With above settings -> nbs_path = nbs and doc_path = docs

  1. nbdev_preview generates _docs inside nbs folder
  2. nbdev_docs generates _docs inside root folder as expected
  3. I am unable to locate the _proc folder described in https://nbdev.fast.ai/Explanations/docs.html#deploying-your-docs-on-other-platforms . Do I need to enable a new setting?

My nbdev version is nbdev 2.2.10

Rahuketu86 commented 2 years ago

I finally found the nbdev Discord channel and the post about the upcoming version 2.3 release: https://forums.fast.ai/t/upcoming-changes-in-v2-3/98905. Looking forward to the release. Absolutely love nbdev.

deven367 commented 2 years ago

@Rahuketu86 To fix your issue with the docs, you need to update the doc_path in your settings.ini

image

You also need to add _docs to your .gitignore, which is covered in the final steps. These steps also remove the docs folder.

image

Rahuketu86 commented 2 years ago

Please look at this comment. https://github.com/fastai/nbdev/issues/998#issuecomment-1241008303

I had already done that. Nevertheless, there is a discrepancy in where _docs is created depending on which command you run, nbdev_preview or nbdev_docs (if you have an nbs folder for notebooks). I am hoping this issue will be resolved in the next version with the _proc directory workflow.

Rahuketu86 commented 2 years ago

Version 2.3.0

Request for enhancement

Why not delegate publishing completely to quarto publishing framework? (Can provide a cleaner resolution for (5) as described above)

With the introduction of _proc in version 2.3.0, nbdev resolves a few issues around publishing. But it still doesn't fully capitalize on Quarto's publishing functionality.

Let me try to explain. When I run the nbdev_preview command, a _docs folder is generated in the _proc directory. Looking at the source code, I believe the same or a similar _docs folder is generated by the nbdev_docs command in the root directory.

My thought process is as follows:

  1. Generate a _publish.yml in the _proc folder using the quarto publish command.
  2. We can still use the nbdev_docs command, but the _proc/_docs folder is then updated with everything expected for rendering (without a prior cache, etc.).
  3. Modify the deploy action at https://github.com/fastai/workflows/blob/master/quarto-ghp/action.yml into something more general that (a) runs nbdev_docs and (b) uses something similar to the quarto publish action to deploy the docs based on the _publish.yml settings. (In fact, default deployment to GitHub Pages could already be configured in _publish.yml when the _proc folder is generated by nbdev_prepare, and users could then modify it on a case-by-case basis.)

The deploy setup below has not worked (this might be an issue with quarto-action). I tried various combinations of path [_proc, ./_procs] and of putting _publish.yml in different places (the _proc directory, the root directory, both, etc.).

name: CI
on:
  push:
    paths-ignore:
      - 'book/**'
      - 'nbs_pluto/**'
      - 'nbs_pluto_html/**'
      - 'devops/**'
  pull_request:
  workflow_dispatch:

jobs:
  test-deploy:
    runs-on: ubuntu-latest

    steps:
    - name: Install Dependencies
      run: |
        sudo apt install -y graphviz

    - name: Fastai CI
      uses: fastai/workflows/nbdev-ci@master

    - name: Run nbdev_docs
      shell: bash
      run: |
        nbdev_docs

    - name: Render and Publish 
      uses: quarto-dev/quarto-actions/publish@v2
      with:
        target: netlify
        path: _proc
        NETLIFY_AUTH_TOKEN: ${{ secrets.NETLIFY_AUTH_TOKEN }}

It fails with the following message:

No _publish.yml file available (_publish.yml specifying a destination required for non-interactive publish)
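
A pre-flight check can show whether a _publish.yml actually exists where the publish step might look for it. This is only a sketch: the function name and the default candidate directories are hypothetical, and which directory Quarto really searches for a given path setting is an assumption here, not something confirmed in this thread.

```python
from pathlib import Path
from typing import Iterable, List


def dirs_with_publish_config(
    repo_root: str, candidates: Iterable[str] = (".", "_proc")
) -> List[str]:
    """Return the candidate directories that contain a _publish.yml file.

    The candidates are relative to repo_root; the defaults are only
    the two locations mentioned in this issue.
    """
    root = Path(repo_root)
    return [c for c in candidates if (root / c / "_publish.yml").is_file()]
```

Running this as an extra step after nbdev_docs (for example, printing the result) would show whether the file survives into the checkout that the action sees. A file that exists locally but is listed in .gitignore never reaches CI.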

I am currently using the following workflow to deploy to Netlify:


name: CI
on:
  push:
    paths-ignore:
      - 'book/**'
      - 'nbs_pluto/**'
      - 'nbs_pluto_html/**'
      - 'devops/**'
  pull_request:
  workflow_dispatch:

jobs:
  test-deploy:
    runs-on: ubuntu-latest

    steps:
    - name: Install Dependencies
      run: |
        sudo apt install -y graphviz

    - name: Fastai CI
      uses: fastai/workflows/nbdev-ci@master

    - name: Run nbdev_docs
      shell: bash
      run: |
        nbdev_docs

    - name: Push to netlify site
      uses: nwtgck/actions-netlify@v1.2
      with:
        publish-dir: './_docs'
        production-branch: master
        github-token: ${{ secrets.GITHUB_TOKEN }}
        deploy-message: "Deploy from GitHub Actions"
        enable-pull-request-comment: false
        enable-commit-comment: true
        overwrites-pull-request-comment: true
      env:
        NETLIFY_AUTH_TOKEN: ${{ secrets.NETLIFY_AUTH_TOKEN }}
        NETLIFY_SITE_ID: ${{ secrets.NETLIFY_SITE_ID }}
      timeout-minutes: 1

hamelsmu commented 2 years ago

@Rahuketu86

Request for enhancement: Why not delegate publishing completely to quarto publishing framework?

We do have some pointers on how to do something similar in "Deploying your docs on other platforms", which might give you some insight.

Can you please transfer your comment to a separate GitHub issue about generalizing the docs publishing Action to support various targets?

hamelsmu commented 2 years ago

@Rahuketu86 just FYI, I'm finding it hard to follow this issue at the moment. Can you please update it with a summary of the issues you are still facing? It is not clear to me which ones are outstanding. Thanks much for your help.

seeM commented 2 years ago

I think the original issue is resolved, so I'm going to close this. @Rahuketu86 please feel free to create another issue with your feature request re the quarto publishing framework – it's an interesting idea and I don't want it to get lost among the other comments :)!