microsoft / MCW-Analyzing-text-with-Azure-Machine-Learning-and-Cognitive-Services

MCW Analyzing text with Azure Machine Learning and Cognitive Services
MIT License

October 2021 suggestions #52

Closed timahenning closed 2 years ago

shirolkar commented 2 years ago
  1. Update the lab to show the following new Text Analytics features:
    • Sentiment Analysis
    • Opinion mining
    • Key phrase extraction
    • Language detection
    • PII detection
  2. Update the lab to also include Azure Automated ML for text classification. We can highlight the AutoML no-code experience in AML Studio.
  3. Update the WDS section to reflect all the changes.
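
For reference, the five features in item 1 map onto Text Analytics v3.1 REST routes roughly as sketched below. This is a minimal illustration, not lab code: the helper names are hypothetical, and the routes should be rechecked against whichever API version the lab ends up targeting. Note that opinion mining is a flag on the sentiment call, not a route of its own.

```python
import json
import urllib.request

# Route suffixes under <endpoint>/text/analytics/v3.1/ for each feature.
FEATURE_ROUTES = {
    "sentiment": "sentiment",
    "opinion_mining": "sentiment?opinionMining=true",
    "key_phrases": "keyPhrases",
    "language_detection": "languages",
    "pii": "entities/recognition/pii",
}


def build_url(endpoint, feature, api_version="v3.1"):
    """Return the full request URL for one feature (hypothetical helper)."""
    base = endpoint.rstrip("/") + "/text/analytics/" + api_version + "/"
    return base + FEATURE_ROUTES[feature]


def build_body(texts, language=None):
    """Wrap raw strings in the 'documents' envelope the service expects."""
    docs = []
    for i, text in enumerate(texts, start=1):
        doc = {"id": str(i), "text": text}
        if language is not None:
            doc["language"] = language
        docs.append(doc)
    return {"documents": docs}


def call_feature(endpoint, key, feature, texts, language="en"):
    """POST the documents to the service (needs a real endpoint and key)."""
    if feature == "language_detection":
        language = None  # the languages route detects it, so no hint is sent
    request = urllib.request.Request(
        build_url(endpoint, feature),
        data=json.dumps(build_body(texts, language)).encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Only `call_feature` touches the network; the two builder functions can be exercised without a deployed resource.
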
bdleavitt commented 2 years ago
  1. I think we should update the title of the lab to focus more on the actual content (i.e. Custom ML and Pre-built AI for Text Analytics vs. deep learning)

  2. In the "Before the HOL", the instructions tell people to deploy individual cognitive services (text and computer vision), but the HOL step-by-step uses the generic cognitive services resource. This causes some confusion with the steps related to Notebook five and the format/method of calling the endpoint. I think both are valid, but we should just choose one or the other.

  3. I would recommend updating the images/instructions to use the Azure ML notebooks UI instead of JupyterLab.

  4. The current default kernels (AML 3.6) in Jupyter lose community support in December 2021. It may make sense to instruct users to use the AML Python 3.8 environment. https://azure.microsoft.com/en-us/updates/community-support-for-python-36-is-ending-on-23-december-2021/#:~:text=Community%20support%20for%20Python%203.6%20is%20ending%20on,it%20will%20be%20unsupported%20after%2023%20December%202021. We may also need to update the pip instructions so that packages install into the 3.8 kernel.

  5. What about a requirements.txt file instead of separate cells in the 00 Init.ipynb?

  6. Managed online endpoints are coming, and will largely supplant ACI as a deployment target. Does it make sense to use them in the lab? If not, then at least adding some discussion in the whiteboarding session will help. https://docs.microsoft.com/en-us/azure/machine-learning/how-to-use-managed-online-endpoint-studio https://docs.microsoft.com/en-us/azure/machine-learning/how-to-deploy-managed-online-endpoints

  7. Notebook 2: after deploying to ACI, add the code that would allow someone to reference the webservice endpoint without depending on the deployment step [It took a lot longer than expected for the ACI deployment, and when I came back to it my session had timed out. So, a cell with the code to reference an existing endpoint would help.]

    i.e.

        from azureml.core import Model, Webservice, Workspace
        from azureml.exceptions import WebserviceException

        ws = Workspace.from_config()

        # Create a Python object to reference the deployed webservice by name.
        service_name = "summarizer"
        webservice = Webservice(ws, service_name)

  8. Notebook 3: there was an issue installing/importing openpyxl, a required dependency, when using the AML Python 3.8 kernel. I was able to install it with conda install, but with pip I could only figure out how to install into the AML Python 3.6 kernel.

  9. Notebook 3: Should probably call out that the deployed compute cluster is set to always have one node running -- this will create an ongoing cost in the subscription if the participant doesn't modify or delete the cluster; BUT, it's good for the lab so you don't wait as long for nodes to provision / deprovision.

  10. Notebook 3: the cluster training job takes a long time on the first run because the Docker image is being built. Maybe call that out.

  11. Notebook 3: FYI - the RunDetails(run).show() widget didn't display in the Azure ML notebook. The product team has confirmed that's a bug and they'll take a look at it. It's probably worthwhile to also show participants where to find the same information in the studio UI (Experiments > {Experiment Name} > {Run Name}).

  12. Notebook 5: why are we using the assert command on the api key variables?

  13. Notebook 5: the vision endpoint for the standalone Computer Vision service didn't work with the REST code supplied. I had to use the https://eastus2.api.cognitive.microsoft.com/ format instead. It didn't seem to matter whether I used v3.0 or v3.1, but we should probably test on the latest version.

        vision_endpoint = 'https://eastus2.api.cognitive.microsoft.com/' #""
        vision_base_url = vision_endpoint + "vision/v3.1/"
        vision_analyze_url = vision_base_url + "analyze"
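
One possible way to handle items 12 and 13 together is to normalize whichever endpoint the participant pastes in, and to raise a descriptive error for a blank key or endpoint instead of relying on a bare assert (asserts carry no message by default and are skipped when Python runs with -O). A minimal sketch, with hypothetical function names and the same vision/v3.1/analyze path as in item 13:

```python
def require_setting(name, value):
    """Fail with a readable message when a key or endpoint was left blank."""
    if not value or not value.strip():
        raise ValueError(name + " is empty; paste the value from the Azure portal.")
    return value.strip()


def build_vision_analyze_url(endpoint, api_version="v3.1"):
    """Build the image analysis URL from the endpoint shown in the portal."""
    endpoint = require_setting("vision_endpoint", endpoint)
    # Accept endpoints with or without a trailing slash.
    return endpoint.rstrip("/") + "/vision/" + api_version + "/analyze"


if __name__ == "__main__":
    print(build_vision_analyze_url("https://eastus2.api.cognitive.microsoft.com/"))
```

The API version is a parameter so that the lab can be retested against a newer version without touching the URL logic.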

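For items 4, 5, and 8, one option is to install from a single requirements.txt using the interpreter that backs the running kernel, so that packages land in the Python 3.8 environment and not the 3.6 one. A sketch, assuming a requirements.txt sits next to the notebooks (that file does not exist in the lab yet, and the function names are hypothetical):

```python
import subprocess
import sys


def pip_install_command(requirements_path="requirements.txt"):
    """Build a pip command bound to the interpreter running this kernel."""
    # sys.executable is the Python binary of the active kernel, so pip
    # installs into that environment, not whichever pip is first on PATH.
    return [sys.executable, "-m", "pip", "install", "-r", requirements_path]


def install_requirements(requirements_path="requirements.txt"):
    """Run the install; raises CalledProcessError if pip fails."""
    subprocess.check_call(pip_install_command(requirements_path))
```

In a notebook cell, the `%pip install -r requirements.txt` magic in recent IPython versions targets the current kernel in the same way.
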
shirolkar commented 2 years ago

@bdleavitt I have upgraded all the notebooks to use the Python 3.8 kernel. However, it appears that AutoML is incompatible with Python 3.8 (https://docs.microsoft.com/en-us/azure/machine-learning/how-to-configure-auto-train#prerequisites). So while I can train an AutoML model on remote compute from the notebook, I cannot download and use the trained model in the notebook. Furthermore, restricting the run to ONNX-compatible models rules out the DNN models for text. So I am thinking of showing AutoML training only from the AML Studio UI, where the minimum run time is over an hour. The students can start the experiment run, continue with the rest of the lab, and review the results in AML Studio once the AutoML run completes. Thanks. Jitendra
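
If the notebook path is kept at all, a guard cell along these lines could stop participants from trying to load an AutoML model in an incompatible kernel. It is only a sketch: the supported set (3.6 and 3.7) is an assumption inferred from the prerequisites page linked above and should be rechecked against the current documentation, and the names are hypothetical.

```python
import sys

# Assumption: Python versions the AutoML SDK accepted at the time of this thread.
ASSUMED_AUTOML_VERSIONS = {(3, 6), (3, 7)}


def automl_compatible(version_info=None, supported=ASSUMED_AUTOML_VERSIONS):
    """Return True when the (major, minor) version is in the supported set."""
    if version_info is None:
        version_info = sys.version_info
    return (version_info[0], version_info[1]) in supported


if __name__ == "__main__":
    if not automl_compatible():
        print("This kernel cannot load the AutoML model; review the run in AML Studio.")
```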

DawnmarieDesJardins commented 2 years ago

Closing issue with merge of PR #53