fastai / fastbook

The fastai book, published as Jupyter Notebooks

fastbook.setup_book() requesting Google Drive access #546

Open Katze2664 opened 1 year ago

Katze2664 commented 1 year ago

In 01_intro.ipynb on Colab, when I run the first cell

#hide
! [ -e /content ] && pip install -Uqq fastbook
import fastbook
fastbook.setup_book()

it requests access to my Google Drive.

Specifically it says:

This will allow Google Drive for desktop to:
- See, edit, create and delete all of your Google Drive files
- View the photos, videos and albums in your Google Photos
- Retrieve Mobile client configuration and experimentation
- View Google people information such as profiles and contacts
- View the activity record of files in your Google Drive
- See, edit, create and delete any of your Google Drive documents

Is Google Drive access necessary? It seems like a privacy risk.

thedch commented 1 year ago

Strangely enough, fastbook.__version__ returns 0.0.29 on Colab, which is much higher than any version visible in this GitHub repo (currently the latest I see is 0.0.19, cut on Apr 15). Looking through the codebase, I don't see any implementation of def setup_book. Not sure what's going on that makes this so obfuscated (especially for a teaching resource) -- @sgugger any tips? 🙏

Note that one can download the source code directly from the pypi index and inspect as desired: https://pypi.org/project/fastbook/#files -- but it's weird that the versions aren't reflected on GitHub

The implementation (which is inside __init__.py, curiously) of setup_book just calls setup_colab(), which does:

def setup_colab():
    "Sets up Colab. First run `!pip install -Uqq fastbook` in a cell"
    assert IN_COLAB, "You do not appear to be running in Colab"
    global gdrive
    gdrive = Path('/content/gdrive/My Drive')
    from google.colab import drive
    if not gdrive.exists(): drive.mount(str(gdrive.parent))

(note that I have to copy/paste the implementation instead of linking it, because it doesn't seem to exist on github)

A bit worrisome that there's a global variable there. I can't find any other usage of it (but there may be, which is why we try not to use globals). I got curious what this google.colab library does, so I looked around and found this: https://pypi.org/project/google-colab/ -- the homepage links to https://colaboratory.research.google.com/ which (!!) does not exist.

So yeah, overall a stack of curious things going on here. I guess I'll try commenting out setup_book and see what happens in the notebook...

sebkraemer commented 1 year ago

It appears the releases on github aren't maintained properly, which explains the lack of version 0.0.29 here.

So if the offending code isn't even here in the repo, that means one cannot create a PR against it, right?

> I guess I'll try commenting out setup_book and see what happens in the notebook...

If the code only serves mounting Google Drive, one might get away with that indeed. The notebook can hardly assume to find anything there. Did you try to run it without that line and did you encounter any problems, @thedch ?

setup_book might be a convenience function for people who put stuff in their cloud drive to use that in the notebooks.
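If the Drive mount really is only a convenience, it could be made opt-in. The following is a minimal sketch under that assumption, not fastbook's actual code: the helper name `optional_drive_mount` and its `mount` flag are hypothetical, and detecting Colab by checking `sys.modules` is likewise an assumption. Only `google.colab.drive.mount` is taken from the implementation quoted earlier in this thread.

```python
import sys
from pathlib import Path


def optional_drive_mount(mount: bool = False, root: str = "/content/gdrive"):
    """Hypothetical helper: mount Google Drive only when explicitly requested.

    Returns the Drive path when a mount was requested inside Colab,
    otherwise None (no permission prompt is triggered).
    """
    # Assumption: Colab preloads google.colab, so its presence marks the environment.
    if not mount or "google.colab" not in sys.modules:
        return None
    from google.colab import drive  # only importable inside Colab

    gdrive = Path(root) / "My Drive"
    if not gdrive.exists():
        drive.mount(root)  # this is the call that prompts for Drive access
    return gdrive


# Default behaviour: nothing is mounted and no access is requested.
result = optional_drive_mount()
print(result)
```

With the default `mount=False`, the first cell would run without the permission prompt. Whether later cells in the notebooks depend on the `gdrive` path is not established in this thread.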

I'm a bit surprised that so few people complain about, or even mention, the need to connect one's Google Drive to run the notebooks on Colab.

I went to run fastbook on Kaggle but that brings some incompatibility problems.

indirectlylit commented 7 months ago

It appears that the fastbook package on pypi is not and has likely never been defined in this repo.

It appears to be defined in the fastai/course20 repo, in the fastbook sub-directory. Independently of the Google Drive permissions question, it would be better if that package were vendored into this repo so people can easily inspect what code is being run and make modifications as desired.