STRIDES / NIHCloudLabAzure

Documentation and tutorials on using Azure for biomedical research

Verify and fix 'import' issues in notebooks. #91

Open furniturewalatkNIH opened 6 months ago

furniturewalatkNIH commented 6 months ago

We need to have an automated way to verify the notebooks.

One part of this is the verification of 'imports' in them.

We need to create a workflow/job that runs through all notebooks and verifies each one.
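As a rough illustration of what such a check could look like, here is a minimal sketch that only inspects imports without executing any notebook. The helper names `notebook_imports` and `missing_modules` are made up for this example and are not part of the repository or of any workflow in it. It assumes notebooks are plain `.ipynb` JSON files and skips IPython magics and cells that do not parse.

```python
import ast
import importlib.util
import json
from pathlib import Path


def notebook_imports(nb_path):
    """Return the top-level module names imported by a notebook's code cells."""
    nb = json.loads(Path(nb_path).read_text(encoding="utf-8"))
    modules = set()
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        if isinstance(source, list):  # nbformat stores source as a list of lines
            source = "".join(source)
        # Drop IPython magics and shell escapes, which are not valid Python.
        lines = [
            ln for ln in source.splitlines()
            if not ln.lstrip().startswith(("%", "!"))
        ]
        try:
            tree = ast.parse("\n".join(lines))
        except SyntaxError:
            continue  # skip cells that still do not parse
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                modules.update(alias.name.split(".")[0] for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
                modules.add(node.module.split(".")[0])
    return modules


def missing_modules(modules):
    """Return the sorted module names that cannot be found in this environment."""
    return sorted(m for m in modules if importlib.util.find_spec(m) is None)
```

To scan a whole checkout, one could loop over `Path("notebooks").rglob("*.ipynb")` and print `missing_modules(notebook_imports(path))` for each file. A static check like this is much faster than executing the notebooks, but it only sees top-level package names, so it cannot tell that `azure` is present while `azure.storage` is not.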

furniturewalatkNIH commented 6 months ago

Created new workflow - https://github.com/STRIDES/NIHCloudLabAzure/actions/workflows/verify-imports-jupyter-nbmake.yml

furniturewalatkNIH commented 6 months ago

`============================================================= FAILURES =============================================================
_ /workspaces/NIHCloudLabAzure/notebooks/GenAI/notebooks/AzureAIStudio_index_structured_with_console.ipynb
[gw0] linux -- Python 3.10.13 /usr/local/python/3.10.13/bin/python3

#connect to vector store
from azure.search.documents import SearchClient
from azure.core.credentials import AzureKeyCredential

search_client = SearchClient(endpoint, index_name, AzureKeyCredential(index_key))

ModuleNotFoundError                       Traceback (most recent call last)
Cell In[4], line 2
      1 #connect to vector store
----> 2 from azure.search.documents import SearchClient
      3 from azure.core.credentials import AzureKeyCredential
      5 search_client = SearchClient(endpoint, index_name, AzureKeyCredential(index_key))

ModuleNotFoundError: No module named 'azure'
_ /workspaces/NIHCloudLabAzure/notebooks/GenAI/notebooks/AzureOpenAI_embeddings.ipynb __
[gw0] linux -- Python 3.10.13 /usr/local/python/3.10.13/bin/python3

import os
from openai import AzureOpenAI
import dotenv
import requests
import numpy as np
import pandas as pd

ModuleNotFoundError                       Traceback (most recent call last)
Cell In[2], line 3
      1 import os
      2 from openai import AzureOpenAI
----> 3 import dotenv
      4 import requests
      5 import numpy as np

ModuleNotFoundError: No module named 'dotenv'
_____ /workspaces/NIHCloudLabAzure/notebooks/GenAI/notebooks/Pubmed_RAG_chatbot.ipynb __
[gw0] linux -- Python 3.10.13 /usr/local/python/3.10.13/bin/python3

# create your SAS token
from datetime import datetime, timedelta
from azure.storage.blob import BlobServiceClient, generate_account_sas, ResourceTypes, AccountSasPermissions
start_time = datetime.utcnow()
expiry_time = start_time + timedelta(hours=2)
sas_token = generate_account_sas(
    account_name=storage_account_name,
    container_name=container_name,
    account_key=key,
    resource_types=ResourceTypes(object=True),
    permission=AccountSasPermissions(read=True, write=True, delete=True, list=True, add=True, create=True),
    expiry=expiry_time,
    start=start_time
)

ModuleNotFoundError                       Traceback (most recent call last)
Cell In[10], line 3
      1 # create your SAS token
      2 from datetime import datetime, timedelta
----> 3 from azure.storage.blob import BlobServiceClient, generate_account_sas, ResourceTypes, AccountSasPermissions
      4 start_time = datetime.utcnow()
      5 expiry_time = start_time + timedelta(hours=2)

ModuleNotFoundError: No module named 'azure.storage'
__ /workspaces/NIHCloudLabAzure/notebooks/GenAI/notebooks/AzureAIStudio_sql_chatbot.ipynb __
[gw0] linux -- Python 3.10.13 /usr/local/python/3.10.13/bin/python3

import pyodbc

server_name = ""
username = ""
password = ""
database = ""
driver= '{ODBC Driver 18 for SQL Server}'

conn = pyodbc.connect(f'DRIVER={driver};PORT=1433;SERVER={server_name}.database.windows.net;PORT=1443;DATABASE={database};UID={username};PWD={password}')

ModuleNotFoundError                       Traceback (most recent call last)
Cell In[7], line 1
----> 1 import pyodbc
      3 server_name = ""
      4 username = ""

ModuleNotFoundError: No module named 'pyodbc'
_ /workspaces/NIHCloudLabAzure/notebooks/pangolin/pangolin_pipeline.ipynb __
[gw0] linux -- Python 3.10.13 /usr/local/python/3.10.13/bin/python3

# import libraries
import os
from Bio import SeqIO
from Bio import Entrez
import ipyrad as ipa
import toytree

ModuleNotFoundError                       Traceback (most recent call last)
Cell In[7], line 5
      3 from Bio import SeqIO
      4 from Bio import Entrez
----> 5 import ipyrad as ipa
      6 import toytree

ModuleNotFoundError: No module named 'ipyrad'
========================================================= warnings summary =========================================================
notebooks/GenAI/notebooks/sharepoint_RAG_bot.ipynb::
  /home/codespace/.local/lib/python3.10/site-packages/nbformat/__init__.py:96: MissingIDFieldWarning: Cell is missing an id field, this will become a hard error in future nbformat versions. You may want to use normalize() on your notebooks before validations (available since nbformat 5.1.4). Previous versions of nbformat are fixing this issue transparently, and will stop doing so in the future.
    validate(nb)

../../usr/local/python/3.10.13/lib/python3.10/site-packages/_pytest/cacheprovider.py:461
  /usr/local/python/3.10.13/lib/python3.10/site-packages/_pytest/cacheprovider.py:461: PytestCacheWarning: could not create cache path /workspaces/NIHCloudLabAzure/.pytest_cache/v/cache/nodeids: [Errno 28] No space left on device: '/workspaces/NIHCloudLabAzure/pytest-cache-files-tklqbmxd'
    config.cache.set("cache/nodeids", sorted(self.cached_nodeids))

../../usr/local/python/3.10.13/lib/python3.10/site-packages/_pytest/cacheprovider.py:413
  /usr/local/python/3.10.13/lib/python3.10/site-packages/_pytest/cacheprovider.py:413: PytestCacheWarning: could not create cache path /workspaces/NIHCloudLabAzure/.pytest_cache/v/cache/lastfailed: [Errno 28] No space left on device: '/workspaces/NIHCloudLabAzure/pytest-cache-files-b91m919k'
    config.cache.set("cache/lastfailed", self.lastfailed)

../../usr/local/python/3.10.13/lib/python3.10/site-packages/_pytest/stepwise.py:57
  /usr/local/python/3.10.13/lib/python3.10/site-packages/_pytest/stepwise.py:57: PytestCacheWarning: could not create cache path /workspaces/NIHCloudLabAzure/.pytest_cache/v/cache/stepwise: [Errno 28] No space left on device: '/workspaces/NIHCloudLabAzure/pytest-cache-files-zp6jr5yu'
    session.config.cache.set(STEPWISE_CACHE_DIR, [])

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html

Learn more about nbmake at https://github.com/treebeardtech/nbmake

===================================================== short test summary info ======================================================
FAILED notebooks/GenAI/notebooks/AzureAIStudio_index_structured_with_console.ipynb::
FAILED notebooks/GenAI/notebooks/AzureOpenAI_embeddings.ipynb::
FAILED notebooks/GenAI/notebooks/Pubmed_RAG_chatbot.ipynb::
FAILED notebooks/GenAI/notebooks/AzureAIStudio_sql_chatbot.ipynb::
FAILED notebooks/pangolin/pangolin_pipeline.ipynb::
======================================= 5 failed, 6 passed, 4 warnings in 759.42s (0:12:39) ========================================`
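For the "fix" half of this issue, one possible next step is to turn the import names from the failing cells into a list of packages to install. The sketch below is only an illustration: the names `PIP_PACKAGE_FOR_IMPORT` and `pip_requirements` are hypothetical, and the mapping covers just the imports that appear in the log above. `ipyrad` is deliberately left unmapped, on the assumption that it is normally installed with conda rather than pip.

```python
# Import names taken from the failing cells in the log, mapped to the
# PyPI distribution that provides each one. Hypothetical helper table.
PIP_PACKAGE_FOR_IMPORT = {
    "azure.search.documents": "azure-search-documents",
    "azure.core": "azure-core",
    "azure.storage.blob": "azure-storage-blob",
    "dotenv": "python-dotenv",
    "pyodbc": "pyodbc",
}


def pip_requirements(imports):
    """Split dotted import names into (pip distributions, unmapped names)."""
    found, unmapped = set(), []
    for name in imports:
        parts = name.split(".")
        # Try the longest dotted prefix first, e.g. azure.core.credentials,
        # then azure.core, then azure.
        for end in range(len(parts), 0, -1):
            prefix = ".".join(parts[:end])
            if prefix in PIP_PACKAGE_FOR_IMPORT:
                found.add(PIP_PACKAGE_FOR_IMPORT[prefix])
                break
        else:
            unmapped.append(name)  # no prefix matched the table
    return sorted(found), unmapped


failing = [
    "azure.search.documents",
    "azure.core.credentials",
    "dotenv",
    "azure.storage.blob",
    "pyodbc",
    "ipyrad",
]
requirements, unmapped = pip_requirements(failing)
print("\n".join(requirements))
print("unmapped:", unmapped)
```

The printed distribution names could go into a requirements file that the workflow installs before running nbmake. The import name and the distribution name often differ (`dotenv` versus `python-dotenv`), which is why an explicit table is used here instead of passing the import names straight to pip.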