ocean-data-factory-sweden / kso

Notebooks to upload/download marine footage, connect to a citizen science project, train machine learning models and publish marine biological observations.
GNU General Public License v3.0
4 stars 12 forks source link

build: simplified requirements and first two cells of the notebooks #413

Closed victor-wildlife closed 1 week ago

review-notebook-app[bot] commented 3 weeks ago

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

jannesgg commented 3 weeks ago

@victor-wildlife Great job, I like the idea of compressing this big block of code. I just wonder if we shouldn't keep the "dev" version or rather the "local" version of kso_utils as the default. We could maybe just do a check to see if the folder is available in our current directory, and if not download the packaged version. Otherwise, we might need to do many more releases than we would like to try and "match" the latest dev at all times, when we should rather just do stable releases. Also because when using the notebooks, they are tied to a version of the branch, which is then perhaps confusing when that does not match the version of "kso_utils" in that branch by default. Any thoughts on this?

victor-wildlife commented 3 weeks ago

@jannesgg that makes sense, see the latest commit using functions to tidy things up and using dev/local as default if available

jannesgg commented 3 weeks ago

@jannesgg that makes sense, see the latest commit using functions to tidy things up and using dev/local as default if available

@victor-wildlife Looks better now. It does not seem to work for me locally though. Does it currently work for you?

Also the imports should maybe just be in a try, except block instead of a function? As it is it seems that the tests are failing because they do not pick up imports hidden inside of functions.

victor-wildlife commented 2 weeks ago

@jannesgg what's the issue it comes up with in your local machine? It works for me and in Cloudina as well. I saw there was an issue with the tests... Taking the initial import out of the function might require users to run the same cell twice 1st to try running the import section and add kso_utils to the system and 2nd to try running the import section again. Can you think of a different workaround?

jannesgg commented 2 weeks ago

@victor-wildlife I managed to solve the local issue, had to start with a clean environment for some reason. :)

I know it's an ugly duplication but maybe for now we could just repeat the import block. This should have the same effect as the one you had before and allow the tests to pass. Something like this?


%matplotlib inline

def initiate_dev_version():
    kso_path = os.path.abspath(os.path.join(os.getcwd(), "../.."))
    if os.path.isdir(os.path.join(kso_path, "kso_utils")):
        sys.path.insert(0, kso_path)
        %load_ext autoreload
        %autoreload 2
        print("Development mode ON - kso-utils added to the system.")
    else:
        raise FileNotFoundError("kso_utils directory not found in the expected path.")

def install_kso_utils():
    !pip install -q kso-utils
    # Temporary workaround to install panoptes from the source (avoid requests incompatibility)
    !pip install git+https://github.com/zooniverse/panoptes-python-client.git
    print("Restarting runtime to apply package changes...")
    os.kill(os.getpid(), 9)

try:
    import kso_utils.widgets as kso_widgets
    import kso_utils.project_utils as p_utils
    import kso_utils.yolo_utils as y_utils
    from kso_utils.project import ProjectProcessor, MLProjectProcessor
    print("KSO successfully imported!")
except Exception as e:
    print(f"Error importing kso modules: {e}")
    try:
        initiate_dev_version()
        import kso_utils.widgets as kso_widgets
        import kso_utils.project_utils as p_utils
        import kso_utils.yolo_utils as y_utils
        from kso_utils.project import ProjectProcessor, MLProjectProcessor
        print("KSO successfully imported!")
    except Exception as e:
        install_kso_utils()
victor-wildlife commented 2 weeks ago

@jannesgg I am happy to implement that suggestion

jannesgg commented 2 weeks ago

@victor-wildlife I think everything seems to be coming together nicely. I only noticed that some of the necessary modules are not being imported since some of the notebooks use a different set of utils, e.g. p_utils vs s_utils. Should we just import the max so that each block will be the same for each notebook even if the import is not used? This might cause the tests to fail but I could possibly add an exclusion. Or we just have to adapt each notebook's first block to the required imports for that specific notebook. Any thoughts?

victor-wildlife commented 2 weeks ago

@jannesgg thanks for adding the import sys and os. I am aware importing all the modules might seem easier to develop but I think importing the required modules for each notebook might be the way to go. I am thinking: 1) it will really reflect the modular approach and 2) if any of the external services we use (e.g. zooniverse, mlflow, zenodo) change and we have issues (e.g. doesn't load anymore) the other notebooks might still work. I have updated the imports to be notebook-specific. Let me know what you think. When everything looks good on your end, can you please merge this PR?