Priyankakumavat3531 / Practice_Thing


Lab-3_User2 #3

Open Priyankakumavat3531 opened 1 year ago

Priyankakumavat3531 commented 1 year ago

SAFERA LAB-3

Table of Contents

Sl. No | Topics
-- | --
1 | Objectives
2 | Checkpoints
3 | Getting Started: Set up AutoML training with the Notebook
4 | Model 1: Crime Type Prediction (ClassificationModel)
5 | Model 1: Export dataset from SQL Database Connection
6 | Model 1: Model Training
7 | Model 1: Testing and Evaluation
8 | Model 1: Prediction
9 | Model 2: Time series Model
10 | Model 2: Export dataset from SQL Database Connection
11 | Model 2: Training and Prediction

1. Objectives

This lab will show you how to set up an automated machine learning (AutoML) training job with the Azure Machine Learning Notebook. Automated ML picks an algorithm and hyperparameters for you and generates a model ready for deployment. This lab provides details of the various options that you can use to configure automated ML experiments.

The basic flow diagram for this lab is outlined below. It shows the activities we'll be performing as part of this lab, starting with exporting the dataset required to train the models from Azure SQL Database and then building an AI/ML model using Azure ML Studio.

image

Note: The AI/ML models built in this lab are not ready for production; we're only using 4 years of Chicago Crime data for the model training. Hence, the prediction is done using a Python notebook.

2. Checkpoints

You will require the below artifacts to run through Lab 3:

Sl. No | Required Checkpoints
-- | --
1 | Azure subscription
2 | Cleaned data processed during Lab 1 and saved in SQL table
3 | Azure Resource group
4 | Azure Machine Learning Studio
5 | Compute Instance in ML studio workspace
6 | SQL Database Credentials information like server, database, username, password, driver
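The five SQL credentials in checkpoint 6 are typically combined into a single ODBC-style connection string. The sketch below shows one way to assemble it. The function name, the placeholder values, and the driver name "ODBC Driver 17 for SQL Server" are assumptions for illustration; use the values from your own environment.

```python
def build_connection_string(server, database, username, password, driver):
    """Assemble an ODBC-style connection string from the five credentials."""
    parts = {
        "DRIVER": "{" + driver + "}",  # driver names are wrapped in braces
        "SERVER": server,
        "DATABASE": database,
        "UID": username,
        "PWD": password,
    }
    return ";".join(f"{key}={value}" for key, value in parts.items())


# Placeholder values only (assumptions); substitute your own credentials.
conn_str = build_connection_string(
    server="<your-server>.database.windows.net",
    database="<your-database>",
    username="<your-username>",
    password="<your-password>",
    driver="ODBC Driver 17 for SQL Server",
)
print(conn_str)
```

This simple join does not escape special characters such as `;` or `}` in a password, so treat it as a starting point rather than a complete solution.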

3. Getting Started: Set up AutoML training with the Notebooks

  1. Open your browser and navigate to https://portal.azure.com/

  2. Sign in to the Azure portal using your credentials.

2023-03-11 12_51_38-C__Users_hp_Pictures_Green shots_2023-03-10 23_16_31-Settings png - Greenshot im

  3. You might see a prompt like the one pictured below. Click Ask Later. (Note: Whenever you see this prompt, click the Ask Later option.)

2023-03-11 14_09_43-C__Users_hp_Pictures_Green shots_2023-03-11 14_02_02-C__Users_hp_Pictures_Green

  4. Once you are signed in to the Azure portal, click the Resource groups tab to select the required resources.

    img_1
  5. Select the SaferaLab2 resource group.

img_2
  6. Once you've selected the SaferaLab2 resource group, select the SaferaLab2 workspace from the list. If you have difficulty finding it, type "workspace" into the "Filter for any field" input to show only workspace resources.
img_3
  7. Once you're in the SaferaLab2 workspace, click "Launch Studio" towards the bottom of the screen. It opens in a new browser tab.
img_4
  8. In the left pane, select Notebooks under the Authoring section. img_5

4. Model 1: Crime Type Prediction (ClassificationModel)

  1. Open the Notebook file as follows:
img_6
  2. Once you open the Notebook file, click the + button on the right side, as shown in the image, to create a compute instance.
img_7

Note: If a compute instance is already available, skip the compute-creation steps below and continue from step 5.

    • Enter any name for the compute instance.
    • Select CPU as the virtual machine type.
    • Click Select from all options.
    • Choose any of the listed VMs.
    • Finally, click the Create button.
img_8_part1 img_8_part2
  3. It takes a few minutes (3-5) to create a compute instance.
img_9
  4. Once the compute instance is created, it is shown in green.
img_10

(If you are asked for authentication permission, click the Authenticate button.)

  5. Click the Authenticate button to enable use of the Azure SDK.
img_11
  6. Select the kernel Python 3.8 - AzureML, as shown in the image.
img_12
  7. Click the "Restart kernel and run all cells" button. This runs all the cells in the notebook.
img_13

If you can't find that icon, you can use this method to "Restart kernel and run all cells":

img_14
Instructions to run individual cells: click the cell that you want to run (always run the cells sequentially), then use one of the following:
  1. Shift+Enter: Runs the current cell and selects the cell below it.
  2. Ctrl+Enter: Runs the current cell.

5. Model 1: Export dataset from SQL Database Connection

img_15

Data information from the above image (Note: The screenshot does not capture all columns. Please see the dataset information below.)

img_16

Below is the dataset information:

1: REPORT_DATE : Date on which the Offence/crime was reported

2: OCC_DATE : Date on which the Offence/crime occurred

3: REPORT_YEAR : Year Offence was Reported

4: REPORT_MONTH : Month Offence was Reported

5: REPORT_DAY : Day of the Month Offence was Reported

6: REPORT_DOY : Day of the Year Offence was Reported

7: REPORT_DOW : Day of the Week Offence was Reported

8: REPORT_HOUR : Hour Offence was Reported

9: DIVISION : Police Division where Offence Occurred

10: LOCATION_TYPE : Location Type of Offence

11: PREMISES_TYPE : Premises Type of Offence

12: UCR_CODE : Uniform Crime Reporting (UCR) Code for Offence

13: HOOD_158 : Identifier of Neighbourhood using City of Toronto's new 158 neighbourhood structure

14: NEIGHBOURHOOD_158 : Name of Neighbourhood using City of Toronto's new 158 neighbourhood structure

15: HOOD_140 : Identifier of Neighbourhood using City of Toronto's old 140 neighbourhood structure

16: NEIGHBOURHOOD_140 : Name of Neighbourhood using City of Toronto's old 140 neighbourhood structure

17: LONG_WGS84 : Longitude Coordinates (Offset to nearest intersection)

18: LAT_WGS84 : Latitude Coordinates (Offset to nearest intersection)

19: TimeofCrime : Time of the Offence/crime

20: temp : Temperature of the location

21: conditions : Weather condition of the location

22: MCI_CATEGORY (Target Column): Type of crime

In the dataset, we have 5 categories, as classified under MCI (Major Crime Indicators).
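Several of the columns above (REPORT_YEAR, REPORT_MONTH, REPORT_DAY, REPORT_DOY, REPORT_DOW) describe the same calendar date as REPORT_DATE. The sketch below shows how they relate, assuming they are derived from REPORT_DATE and that month and weekday are stored as English names, as in the prediction inputs in Section 8. The function name is hypothetical.

```python
from datetime import date

# Explicit English names avoid any dependence on the system locale.
MONTH_NAMES = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]
DAY_NAMES = [
    "Monday", "Tuesday", "Wednesday", "Thursday",
    "Friday", "Saturday", "Sunday",
]


def report_date_fields(report_date):
    """Derive the REPORT_* calendar fields from an ISO date string."""
    d = date.fromisoformat(report_date)
    return {
        "REPORT_YEAR": d.year,
        "REPORT_MONTH": MONTH_NAMES[d.month - 1],
        "REPORT_DAY": d.day,
        "REPORT_DOY": d.timetuple().tm_yday,  # day of the year, 1-based
        "REPORT_DOW": DAY_NAMES[d.weekday()],  # weekday() returns 0 for Monday
    }


fields = report_date_fields("2022-10-02")
print(fields)
```

A check like this can help keep hand-entered prediction inputs consistent with each other.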

6. Model 1: Training

img_17 img_18

7. Model 1: Testing and Evaluation

Testing and evaluating the model's performance using classification metrics.

img_19_part1 img_19_part2

Weighted AUC-ROC and accuracy are the evaluation metrics used for testing the model here.
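To make the two metrics concrete, here is a minimal pure-Python sketch. It assumes weighted AUC-ROC means a one-vs-rest AUC per class, averaged with weights equal to the number of true samples in each class. The function names and the four sample rows are made up for illustration and are not taken from the lab notebook or its results.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def binary_auc(is_positive, scores):
    """AUC as the chance a positive outranks a negative (ties count half)."""
    pos = [s for flag, s in zip(is_positive, scores) if flag]
    neg = [s for flag, s in zip(is_positive, scores) if not flag]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def weighted_auc(y_true, proba, classes):
    """One-vs-rest AUC per class, weighted by each class's support."""
    support = Counter(y_true)
    total = len(y_true)
    result = 0.0
    for idx, cls in enumerate(classes):
        flags = [t == cls for t in y_true]
        scores = [row[idx] for row in proba]
        result += support[cls] / total * binary_auc(flags, scores)
    return result


# Made-up toy data: two of the crime categories, four samples.
classes = ["Assault", "Break and Enter"]
y_true = ["Assault", "Assault", "Break and Enter", "Break and Enter"]
proba = [[0.9, 0.1], [0.3, 0.7], [0.35, 0.65], [0.2, 0.8]]
# Predicted label = class with the highest probability in each row.
y_pred = [classes[max(range(len(row)), key=row.__getitem__)] for row in proba]

print(accuracy(y_true, y_pred))
print(weighted_auc(y_true, proba, classes))
```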

8. Model 1: Prediction

Here we pass input values for:

  1. REPORT_DATE
  2. OCC_DATE
  3. REPORT_YEAR
  4. REPORT_MONTH
  5. REPORT_DAY
  6. REPORT_DOY
  7. REPORT_DOW
  8. REPORT_HOUR
  9. DIVISION
  10. LOCATION_TYPE
  11. PREMISES_TYPE
  12. UCR_CODE
  13. HOOD_158
  14. NEIGHBOURHOOD_158
  15. HOOD_140
  16. NEIGHBOURHOOD_140
  17. LONG_WGS84
  18. LAT_WGS84
  19. TimeofCrime
  20. temp
  21. conditions

The model then predicts:

MCI_CATEGORY (Target Column): Type of crime.

In the dataset, we have 5 categories, as classified under MCI (Major Crime Indicators).

1. Case1

REPORT_DATE_val = '2022-10-02'
OCC_DATE_val = '2022-09-29'
REPORT_YEAR_val = 2022
REPORT_MONTH_val = 'October'
REPORT_DAY_val = 2
REPORT_DOY_val = 275
REPORT_DOW_val = 'Sunday'
REPORT_HOUR_val = 7
DIVISION_val = 'D23'
LOCATION_TYPE_val = 'Apartment (Rooming House, Condo)'
PREMISES_TYPE_val = 'Apartment'
UCR_CODE_val = 2120
HOOD_158_val = 1.00
NEIGHBOURHOOD_158_val = 'West Humber-Clairville'
HOOD_140_val = 1.00
NEIGHBOURHOOD_140_val = 'West Humber-Clairville (1)'
LONG_WGS84_val = '-79.57607714'
LAT_WGS84_val = '43.72864324'
TimeofCrime_val = '04:00:00'
temp_val = 11.80
conditions_val = 'Clear'

img_20

We got the result for Case 1 as 'Break and Enter'.

img_21
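Before calling the model, the individual Case 1 values need to be gathered into a single record whose fields match the 21 feature columns. The sketch below is one way to do that and to catch a missing or misspelled field early. The helper name `to_row` is hypothetical, and how the notebook itself passes the record to the model is not shown here.

```python
# The 21 feature columns listed above, in order (the target MCI_CATEGORY is excluded).
FEATURE_COLUMNS = [
    "REPORT_DATE", "OCC_DATE", "REPORT_YEAR", "REPORT_MONTH", "REPORT_DAY",
    "REPORT_DOY", "REPORT_DOW", "REPORT_HOUR", "DIVISION", "LOCATION_TYPE",
    "PREMISES_TYPE", "UCR_CODE", "HOOD_158", "NEIGHBOURHOOD_158", "HOOD_140",
    "NEIGHBOURHOOD_140", "LONG_WGS84", "LAT_WGS84", "TimeofCrime", "temp",
    "conditions",
]

# Case 1 values, copied from the inputs above.
case1 = {
    "REPORT_DATE": "2022-10-02",
    "OCC_DATE": "2022-09-29",
    "REPORT_YEAR": 2022,
    "REPORT_MONTH": "October",
    "REPORT_DAY": 2,
    "REPORT_DOY": 275,
    "REPORT_DOW": "Sunday",
    "REPORT_HOUR": 7,
    "DIVISION": "D23",
    "LOCATION_TYPE": "Apartment (Rooming House, Condo)",
    "PREMISES_TYPE": "Apartment",
    "UCR_CODE": 2120,
    "HOOD_158": 1.00,
    "NEIGHBOURHOOD_158": "West Humber-Clairville",
    "HOOD_140": 1.00,
    "NEIGHBOURHOOD_140": "West Humber-Clairville (1)",
    "LONG_WGS84": "-79.57607714",
    "LAT_WGS84": "43.72864324",
    "TimeofCrime": "04:00:00",
    "temp": 11.80,
    "conditions": "Clear",
}


def to_row(record, columns=FEATURE_COLUMNS):
    """Return the values in feature-column order, failing on any mismatch."""
    missing = [c for c in columns if c not in record]
    extra = [c for c in record if c not in columns]
    if missing or extra:
        raise ValueError(f"missing={missing}, unexpected={extra}")
    return [record[c] for c in columns]


row = to_row(case1)
print(len(row))
```

The same helper works for Case 2 by swapping in that case's values.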

You can change the inputs to get other predictions. Let's run the prediction for Case 2 with changed inputs.

2. Case2

REPORT_DATE_val = '2022-09-29'
OCC_DATE_val = '2022-02-19'
REPORT_YEAR_val = 2022
REPORT_MONTH_val = 'September'
REPORT_DAY_val = 29
REPORT_DOY_val = 272
REPORT_DOW_val = 'Thursday'
REPORT_HOUR_val = 23
DIVISION_val = 'D32'
LOCATION_TYPE_val = 'Streets, Roads, Highways (Bicycle Path, Privat...'
PREMISES_TYPE_val = 'Outside'
UCR_CODE_val = 1450
HOOD_158_val = 155.00
NEIGHBOURHOOD_158_val = 'Downsview'
HOOD_140_val = 27.00
NEIGHBOURHOOD_140_val = 'York University Heights (27)'
LONG_WGS84_val = '-79.46338664'
LAT_WGS84_val = '43.75024493'
TimeofCrime_val = '04:00:00'
temp_val = 11.60
conditions_val = 'Partially cloudy'

img_22

We got the result for Case 2 as 'Assault'.

img_23

9. Model 2: Time series Model

A time series model is created to predict how frequently crime may occur in the future. The library used to achieve this objective is Skforecast, an open-source Python package for time series forecasting. It provides a simple and intuitive API to create and fit forecasting models, using machine learning regressors such as random forests alongside statistical methods such as ARIMA, SARIMA, and exponential smoothing.
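As a rough illustration of how a lag-based forecaster works, the sketch below builds (lag window, next value) training pairs and then forecasts several steps ahead by feeding each prediction back in as a new lag. It assumes the notebook uses this recursive style of forecasting, which is not confirmed here. The function names are hypothetical, the predictor is a plain average rather than a trained model, the counts are made up, and none of this is Skforecast's API.

```python
def make_lag_rows(series, n_lags):
    """Turn a series into (lag window, next value) training pairs."""
    return [
        (series[i - n_lags:i], series[i])
        for i in range(n_lags, len(series))
    ]


def recursive_forecast(series, n_lags, steps, predict_one):
    """Forecast `steps` values, reusing each prediction as the newest lag."""
    window = list(series[-n_lags:])
    forecasts = []
    for _ in range(steps):
        next_value = predict_one(window)
        forecasts.append(next_value)
        # Slide the window: drop the oldest lag, append the new prediction.
        window = window[1:] + [next_value]
    return forecasts


def mean_of_lags(window):
    """Stand-in one-step predictor: the average of the lag window."""
    return sum(window) / len(window)


daily_counts = [10, 12, 14, 16]  # made-up crime counts per day
pairs = make_lag_rows(daily_counts, n_lags=2)
forecast = recursive_forecast(
    daily_counts, n_lags=2, steps=3, predict_one=mean_of_lags
)
print(pairs)
print(forecast)
```

In a real model, `predict_one` would be a regressor fitted on the pairs returned by `make_lag_rows`.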

  1. In the Notebooks section, navigate to the "TimeseriesModel" folder inside your user folder.
  2. Open the "SkforecastModel.ipynb" file. image

Note: If a compute instance was already created for Model 1, skip the compute-creation steps (3 and 4).

  3. Click the three dots as shown in the screenshot and create a new Azure ML compute instance. ts_new2

    • Enter any name for the compute instance.
    • Select CPU as the virtual machine type.
    • Click Select from all options.
    • Choose any of the listed VMs.
    • Finally, click the Create button.

image

  4. You can see the compute instance starting up. Wait until it turns green. image

  5. Select the kernel Python 3 (ipykernel), as shown in the image. image

  6. Click the "Restart kernel and run all cells" button. This runs all the cells in the notebook. ts_new4

  7. This saves the prediction values in a CSV file and shows the trend as below: ts_new5

NOTE: To run an individual cell, click the cell that you want to run and then press Shift+Enter. Make sure you have run all the cells above it first.

10. Model 2: Export dataset from SQL Database Connection

  1. Code snippet to establish a SQL database connection and export data from a SQL table. image
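The general shape of this step is: connect, run a SELECT, and turn the rows into a per-day count for the time series model. The sketch below uses Python's built-in `sqlite3` with an in-memory database as a stand-in for the Azure SQL connection, so it runs without any driver. The table name `crime_data` and the sample rows are made up for illustration, and the notebook's actual query may differ.

```python
import sqlite3
from collections import Counter

# In-memory SQLite stands in for the Azure SQL Database connection (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE crime_data (OCC_DATE TEXT, MCI_CATEGORY TEXT)")
conn.executemany(
    "INSERT INTO crime_data VALUES (?, ?)",
    [
        # Made-up sample rows, for illustration only.
        ("2022-09-29", "Break and Enter"),
        ("2022-09-29", "Assault"),
        ("2022-02-19", "Assault"),
    ],
)

# Export the columns needed to build the time series.
rows = conn.execute("SELECT OCC_DATE, MCI_CATEGORY FROM crime_data").fetchall()
conn.close()

# Count incidents per day and sort chronologically (ISO dates sort as text).
daily_counts = Counter(occ_date for occ_date, _ in rows)
series = sorted(daily_counts.items())
print(series)
```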

11. Model 2: Training and Prediction

Model Training: image

Test Results: image

Prediction: Set the date_val variable to the future date you want to forecast up to. Predictions are saved up to that date. For instance, the screenshot shows a prediction result covering 2 years (from July 23, 2022, to July 23, 2024) with date_val = "2024-07-23".

image
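To see how date_val translates into a number of forecast steps, the sketch below lists the dates a forecast would cover. It assumes one prediction per day and that both the start date and date_val are included. The function name is hypothetical, and the notebook may use a different frequency.

```python
from datetime import date, timedelta


def forecast_dates(start, date_val):
    """Daily dates from `start` through `date_val`, both inclusive."""
    first = date.fromisoformat(start)
    last = date.fromisoformat(date_val)
    if last < first:
        raise ValueError("date_val must not be earlier than the start date")
    return [first + timedelta(days=i) for i in range((last - first).days + 1)]


# The range from the example above: July 23, 2022 to July 23, 2024.
dates = forecast_dates("2022-07-23", "2024-07-23")
print(len(dates), dates[0], dates[-1])
```

Under these assumptions, the length of the list is the number of steps the model has to predict.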