Open ForbiddenDevil opened 4 months ago
Incorporated the suggested changes. PR: https://github.com/Teradata/jupyter-demos/pull/630
Reviewer 1 comments:
In section 1.3
- Warnings should not be ignored. -- This is a standard used across CSAE usecases
In section 3
- After every PMMLPredict results result_lr, resultrf dataframes are converted to pandas df for plotting can be avoided. -- Used tdplot for the same_
In section 4
- Conversion of dataframe for normalizing and plotting can be avoided. -- Used tdplot for the same
- Sklearn package mean_squarederror can be avoided. -- Removed the use of sklean package_
matplotlib.pyplot
can be avoided for plotting.-- Used tdplot for the sameReviewer 1 suggestions:
In section 3
- We can avoid conversion of data and use tdml dataframe for further processing. -- Used tdml wherever possible
In section 4
TD_RegressionEvaluator
method is alternative tdml function to get evaluation score. -- Used this to get the RMSE values- Other normalization can be performed using tdml dataframe. -- Used tdml functions
- teradataml dataframe supports plotting functionality.-- Used tdplot for the same
Reviewer 2 comments:
- In section 2.4, train and test data from csv is loaded into vantage using 2 step process of loading csv data into pandas df and then pandas df into vantage table. _-- Python doesn't allow variable names starting with numbers. Hence OrderedDict cannot be created for this dataframe to be passed as types in read_csv()_
- In section 3, just to print head of result dataframes, teradataml df are converted to pandas df. -- Removed the use of this
- In section 4, matplotlib is used for simple plotting. -- Used tdplot for the same
- Results cells are missing. Plots are not visible. -- Plots are only visible when we run the cell. ReadOnly versions of the notebooks are on a separate branch
Reviewer 1 suggestions:
- In section 2.4, train and test data from csv can be loaded into vantage directly using
read_csv()
of teradataml.-- Python doesn't allow variable names starting with numbers. Hence OrderedDict cannot be created for this dataframe to be passed as types in read_csv()_- In section3, to print head of result
dataframes(result_lr, result_rf)
, directly usehead()
of teradataml. No need to convert it to pandas df. -- Used tdml here- In section 4, teradataml plots can be used instead of matplotlib. -- Used tdplot for the same
- Recompile notebook to generate results and plots. -- Results are not supposed to be saved
Reviewer 1 comments:
matplotlib.pyplot
can be avoided for plotting.Reviewer 1 suggestions:
TD_RegressionEvaluator
method is alternative tdml function to get evaluation score.Reviewer 2 comments:
Reviewer 1 suggestions:
read_csv()
of teradataml.dataframes(result_lr, result_rf)
, directly usehead()
of teradataml. No need to convert it to pandas df.