OpenMined / PyDP

The Python Differential Privacy Library. Built on top of: https://github.com/google/differential-privacy
Apache License 2.0
501 stars 139 forks source link

Test Notebooks #298

Closed edwardelson closed 3 years ago

edwardelson commented 4 years ago

Description

This is written to address Issue #294.

TODO

Affected Dependencies

List any dependencies that are required for this change.

How has this been tested?

Checklist

edwardelson commented 4 years ago

@chinmayshah99 the urls in this notebook seems to be invalid. I replaced "demo" to "dev" in order to make it work (https://raw.githubusercontent.com/OpenMined/PyDP/demo/examples/Tutorial%204-Launch_demo/data/01.csv -> https://raw.githubusercontent.com/OpenMined/PyDP/dev/examples/Tutorial%204-Launch_demo/data/01.csv). Suspect that this is a problem since my notebook test seems to fail here.

chinmayshah99 commented 4 years ago

You can check why your tests are failing here: https://github.com/OpenMined/PyDP/pull/298/checks?check_run_id=1118715651#step:13:88

edwardelson commented 4 years ago

You can check why your tests are failing here: https://github.com/OpenMined/PyDP/pull/298/checks?check_run_id=1118715651#step:13:88

Thanks chinmay. Yes I saw this:

... notebook examples/./Tutorial 4-Launch_demo/DP proof.ipynb

url1 = 'https://raw.githubusercontent.com/OpenMined/PyDP/demo/examples/Tutorial%204-Launch_demo/data/01.csv'
----> df1 = pd.read_csv(url1,sep=",", engine = "python")
HTTPError: HTTP Error 404: Not Found

I tried to call that url1 manually in browser, seems to face the same 404 error. Works when I changed PyDP/demo/examples to PyDP/dev/examples though

chinmayshah99 commented 3 years ago

Hey @edwardelson

  1. Formatted strings are not supported in python 3.5, so that needs to be removed. This is causing the build to fail in py 35 for both ubuntu and osx: https://github.com/OpenMined/PyDP/pull/298/checks?check_run_id=1132984488#step:13:105
  2. There is some permissions issue for windows build: https://github.com/OpenMined/PyDP/pull/298/checks?check_run_id=1132984434#step:13:119. @BenjaminDev can you help us out here?
BenjaminDev commented 3 years ago

@edwardelson @chinmayshah99

The issue is that on windows a NamedTemporaryFile file cannot be accessed while it's open. So the call to run the notebook subprocess.check_call(args) should be outside the context managed block (with statement). This will ensure the file is closed and can be accessed by the subprocess.check_call(args) call.

So Change

def _execute_notebook(notebookPath: str) -> bool:
    # convert notebook-under-test in path to a temp notebook and execute it
    with tempfile.NamedTemporaryFile(suffix=".ipynb") as fout:
        args = [
            "jupyter",
            "nbconvert",
            "--to",
            "notebook",
            "--execute",
            "--output",
            fout.name,
            notebookPath,
        ]
        subprocess.check_call(args) # Note: this line must be outside the context managed block.

    # return true if execution is successful
    return True

to

def _execute_notebook(notebookPath: str, delete=False) -> bool: # Note: the change of behaviour here. No longer delete on close. We might want to clean up after all scripts have run but but in this case it's not really needed.
    # convert notebook-under-test in path to a temp notebook and execute it
    with tempfile.NamedTemporaryFile(suffix=".ipynb") as fout:
        args = [
            "jupyter",
            "nbconvert",
            "--to",
            "notebook",
            "--execute",
            "--output",
            fout.name,
            notebookPath,
        ]
    subprocess.check_call(args)

    # return true if execution is successful
    return True

Nitpicks: Some of the variable names are camel case instead of snake case. Example: notebookPaths should be notebook_paths

Otherwise it looks good to me. @edwardelson please can you make these changes and see if the CI is happy.

edwardelson commented 3 years ago

@edwardelson @chinmayshah99

The issue is that on windows a NamedTemporaryFile file cannot be accessed while it's open. So the call to run the notebook subprocess.check_call(args) should be outside the context managed block (with statement). This will ensure the file is closed and can be accessed by the subprocess.check_call(args) call.

So Change

def _execute_notebook(notebookPath: str) -> bool:
    # convert notebook-under-test in path to a temp notebook and execute it
    with tempfile.NamedTemporaryFile(suffix=".ipynb") as fout:
        args = [
            "jupyter",
            "nbconvert",
            "--to",
            "notebook",
            "--execute",
            "--output",
            fout.name,
            notebookPath,
        ]
        subprocess.check_call(args) # Note: this line must be outside the context managed block.

    # return true if execution is successful
    return True

to

def _execute_notebook(notebookPath: str, delete=False) -> bool: # Note: the change of behaviour here. No longer delete on close. We might want to clean up after all scripts have run but but in this case it's not really needed.
    # convert notebook-under-test in path to a temp notebook and execute it
    with tempfile.NamedTemporaryFile(suffix=".ipynb") as fout:
        args = [
            "jupyter",
            "nbconvert",
            "--to",
            "notebook",
            "--execute",
            "--output",
            fout.name,
            notebookPath,
        ]
    subprocess.check_call(args)

    # return true if execution is successful
    return True

Nitpicks: Some of the variable names are camel case instead of snake case. Example: notebookPaths should be notebook_paths

Otherwise it looks good to me. @edwardelson please can you make these changes and see if the CI is happy.

oh yes that works, thanks @BenjaminDev ! I've also updated the variable names.