informatics-lab / cssp_china_wp5.6_binder

0 stars 0 forks source link

R code #10

Closed nathan962 closed 1 year ago

review-notebook-app[bot] commented 1 year ago

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

review-notebook-app[bot] commented 1 year ago

View / edit / reply to this conversation on ReviewNB

kaedonkers commented on 2023-01-25T18:23:55Z ----------------------------------------------------------------

You've reused and overwritten the variable climate_indices here. In keeping with the previous cell, climate_indices_fname would make sense


review-notebook-app[bot] commented 1 year ago

View / edit / reply to this conversation on ReviewNB

kaedonkers commented on 2023-01-25T18:23:56Z ----------------------------------------------------------------

Looks good but I think it would be more readable if the for-loop in this cell was made into a function (possibly 2 - one for feature selection, the other for plotting). Then for each of the provinces you could separately call the function(s) in separate cells and thus be easier to read than 12 plots in a row.


review-notebook-app[bot] commented 1 year ago

View / edit / reply to this conversation on ReviewNB

kaedonkers commented on 2023-01-25T18:23:57Z ----------------------------------------------------------------

Line #14.        din_temp  = data.frame(cbind(temp_x_train, temp_y_train)) # TO DO why do we combine these 

Regarding the TO DO, I suspect that the Boruta function requires all the data that it is analysing exists in one dataframe


review-notebook-app[bot] commented 1 year ago

View / edit / reply to this conversation on ReviewNB

kaedonkers commented on 2023-01-25T18:23:58Z ----------------------------------------------------------------

Line #24.        boruta_output_precip <- Boruta(precip_y_train ~ ., data=din_precip, doTrace=0, maxRuns= 400, ntree=4000, pValue = 0.01) # TO DO why is this formatted like this 

Following from the comment above, it looks like all the data being analysed by Boruta needs to be in one dataframe, even though you are passing it precip_y_train. Did you test this function using data=precip_x_train instead?

Having said that, might it be more correct to pass the column names of precip_y_train rather than the dataframe itself, since the data is also included in din_precip?


review-notebook-app[bot] commented 1 year ago

View / edit / reply to this conversation on ReviewNB

kaedonkers commented on 2023-01-25T18:23:59Z ----------------------------------------------------------------

The legend overlaps with the data points on these plots, which is a bit confusing to read.


review-notebook-app[bot] commented 1 year ago

View / edit / reply to this conversation on ReviewNB

kaedonkers commented on 2023-01-25T18:24:00Z ----------------------------------------------------------------

Great plots - could do with a legend/colourbar for what the colours mean


nathan962 commented on 2023-01-26T13:54:53Z ----------------------------------------------------------------

Had a good hour at this and no idea how unfortunately. Any ideas? They're correlations so light colours = close to 0 / not correlated and dark colours mean more heavily correlated/ closer to 1

kaedonkers commented on 2023-01-26T14:34:09Z ----------------------------------------------------------------

Classic R... Have you tried a ggplot2 alternative to heatmap? Generally easier to add elements with ggplot

nathan962 commented on 2023-01-26T15:07:34Z ----------------------------------------------------------------

Had a go with ggplot heatmap.2 today but didn't seem to be what i wanted