IhsanKhaliq / valascotraceR

Validates ascotraceR model with data from the field experiments

Ensure that model run summaries are correct #4

Closed adamhsparks closed 2 years ago

adamhsparks commented 2 years ago

Current model run summaries indicate a much lower infection rate than the previous summaries.

@IhsanKhaliq posted this in another issue; it represents the previous version of the summary figure.

The new summary figures show about 1/2 or even less of these values for the disease intensity at the end of season. I need to double check my calculations when summarising.

IhsanKhaliq commented 2 years ago

Another reason for the poor prediction is that the model prediction is based only on a cross-section at 10 m (a horizontal straight line through the middle of the plot) https://github.com/IhsanKhaliq/valascotraceR/blob/4ea15e7a48e986091827454d17ffb9d8ec741316/example_trace_ascoRun.Rmd#L113-L116

The observations were recorded at 0 m (corresponding to the 10 m cross-section above) as well as at 3, 6 and 9 m.

adamhsparks commented 2 years ago

Ah. I missed that point. We can sample model run quadrats at those points. I’m in a meeting all AM. We’ll have a go at this in the arvo


adamhsparks commented 2 years ago

I think I understand what the observed data are. I've removed any quadrats from the model runs that are not observational quadrats. So I didn't keep the whole cross-section, I only selected the quadrats that were observed.
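The filtering step described above can be sketched as follows. This is only an illustration in Python, not the R code in example_trace_ascoRun.Rmd; the `x` and `y` column names and the quadrat coordinates are assumptions made up for the example, while `i_day` and `infectious_gp` are the names used elsewhere in this thread.

```python
# Hypothetical (x, y) grid cells of the quadrats observed in the field.
# The real coordinates are not given in this thread.
OBSERVED_QUADRATS = {(10, 10), (10, 13), (10, 16), (10, 19)}


def keep_observed(rows, observed=OBSERVED_QUADRATS):
    """Return only the model-output rows whose (x, y) cell was observed."""
    return [row for row in rows if (row["x"], row["y"]) in observed]


# Invented model output: one row per quadrat per day.
model_rows = [
    {"x": 10, "y": 10, "i_day": 145, "infectious_gp": 12},
    {"x": 10, "y": 11, "i_day": 145, "infectious_gp": 9},   # not observed
    {"x": 10, "y": 13, "i_day": 145, "infectious_gp": 0},
]

observed_rows = keep_observed(model_rows)
```

Only rows at observation points survive, so any summary computed afterwards compares like with like.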

The model (or my summary of it) is still showing much lower values than the figure above. Any ideas, @IhsanKhaliq or @PaulMelloy? Am I summarising the wrong thing or calculating something the wrong way somewhere? I've pushed up my latest version of the document to create this. It takes a minute or two to run, but not too long.

2panel_fig

IhsanKhaliq commented 2 years ago

These parameters hadn't been provided to the model. https://github.com/IhsanKhaliq/valascotraceR/blob/4ea15e7a48e986091827454d17ffb9d8ec741316/example_trace_ascoRun.Rmd#L79-L83

I added these and the output is slightly better.

IhsanKhaliq commented 2 years ago

The NW9, NW6 and NW3 quadrats haven't been sampled.

IhsanKhaliq commented 2 years ago

Why filter just last day (i_day==145)? https://github.com/IhsanKhaliq/valascotraceR/blob/229c7e4183cc93a552c68db5b1fd1adfea64947d/example_trace_ascoRun.Rmd#L145-L150

The model prediction is over the growing season, so the whole data set needs to be included as done here. Original filtering code is on the dev branch of valascotraceR, in case cross-checking is needed https://github.com/IhsanKhaliq/valascotraceR/blob/4ea15e7a48e986091827454d17ffb9d8ec741316/example_trace_ascoRun.Rmd#L113
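To make the difference between the two filters concrete, here is a minimal Python sketch (not the project's R code). It assumes one row per quadrat per day, with the `i_day` and `infectious_gp` names mentioned in this thread; the values are invented.

```python
from collections import defaultdict


def proportion_infected_by_day(rows):
    """Map each i_day to the share of quadrats with >= 1 infectious growing point."""
    totals = defaultdict(int)
    infected = defaultdict(int)
    for row in rows:
        totals[row["i_day"]] += 1
        if row["infectious_gp"] > 0:
            infected[row["i_day"]] += 1
    return {day: infected[day] / totals[day] for day in sorted(totals)}


# Invented model output for two quadrats on two days.
rows = [
    {"quadrat": "NW3", "i_day": 100, "infectious_gp": 0},
    {"quadrat": "NW6", "i_day": 100, "infectious_gp": 2},
    {"quadrat": "NW3", "i_day": 145, "infectious_gp": 4},
    {"quadrat": "NW6", "i_day": 145, "infectious_gp": 7},
]

# Whole season: one proportion per day, suitable for a progress curve.
season = proportion_infected_by_day(rows)

# Last day only (i_day == 145): a single end-of-season value.
last_day_only = proportion_infected_by_day(
    [r for r in rows if r["i_day"] == 145]
)
```

The last-day filter gives the same end-of-season value as the full series, but it cannot show disease progress over the season.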

adamhsparks commented 2 years ago

I've copied what's been done for the heatmaps, which was the last day only as I recall. I'll recheck Git commits.

However, that doesn't affect progress over the season. The NW quadrats are part of what I'm missing. Thank you.

adamhsparks commented 2 years ago

Including the NW quadrats still results in the model run summary looking pretty much as above. It's still 1/2 of the values in the figure @IhsanKhaliq has provided.

adamhsparks commented 2 years ago

> Why filter just last day (i_day==145)?
>
> https://github.com/IhsanKhaliq/valascotraceR/blob/229c7e4183cc93a552c68db5b1fd1adfea64947d/example_trace_ascoRun.Rmd#L145-L150
>
> The model prediction is over the growing season, so the whole data set needs to be included as done here. Original filtering code is on the dev branch of valascotraceR, in case cross-checking is needed
>
> https://github.com/IhsanKhaliq/valascotraceR/blob/4ea15e7a48e986091827454d17ffb9d8ec741316/example_trace_ascoRun.Rmd#L113

See: https://github.com/IhsanKhaliq/valascotraceR/blob/49a68bad0033adb9572440e084c7cbaf11670db0/example_trace_ascoRun.Rmd#L99

I've duplicated what was done already by selecting the last day.

adamhsparks commented 2 years ago

> I think we need to be looking at the tidy_trace. The problem seems to be in summarising

If it’s summarising, then it’s in the code for this example.

tidy_trace() doesn’t summarise anything.

IhsanKhaliq commented 2 years ago

If parameters were not missing, then I should be getting the same output as yours? The first figure is the output you shared earlier; the second is what I got by adding those parameters.

adamhsparks commented 2 years ago

Hummm. Yup, you’re right. I thought we ran the default values! My bad.


adamhsparks commented 2 years ago

> Sorry, I meant summarise_trace

Possibly. I’ll look into it today

IhsanKhaliq commented 2 years ago

I am just curious. Why is it important to take the mean of new_gp, exposed_gp, etc. in summarise_trace when the model output is the total number of growing points, not the mean number of growing points? Could the summary just provide total numbers rather than mean numbers? https://github.com/IhsanKhaliq/ascotraceR/blob/b81f3d4a743c74f1a54721f75df4ee427cffa747/R/summarise_trace.R#L52-L58

adamhsparks commented 2 years ago

> I am just curious. Why is it important to take the mean of new_gp, exposed_gp, etc. in summarise_trace when the model output is the total number of growing points, not the mean number of growing points? Could the summary just provide total numbers rather than mean numbers? https://github.com/IhsanKhaliq/ascotraceR/blob/b81f3d4a743c74f1a54721f75df4ee427cffa747/R/summarise_trace.R#L52-L58

This should be an issue in ascotraceR for this function.
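For anyone following the mean-versus-total question, a tiny Python sketch of the arithmetic (it does not reproduce summarise_trace() itself; the per-quadrat counts are invented):

```python
from statistics import mean

# Invented new_gp counts for four quadrats on one day.
new_gp_per_quadrat = [40, 10, 0, 30]

total_new_gp = sum(new_gp_per_quadrat)   # total growing points in the paddock
mean_new_gp = mean(new_gp_per_quadrat)   # average growing points per quadrat

# The two summaries differ only by the number of quadrats.
n_quadrats = len(new_gp_per_quadrat)
```

Either summary can be recovered from the other as long as the number of quadrats is known, but the two must not be mixed when comparing against observations.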

adamhsparks commented 2 years ago

Also, I’m on my iPad right now so I’m not completely sure, but I can’t find/don’t recall using summarise_trace() in this document?

IhsanKhaliq commented 2 years ago

No, it's not. There is a tidy_trace() mention though

https://github.com/IhsanKhaliq/valascotraceR/blob/eb86fecc15af4bc0ea55d1610b33dc4ffd269769/example_trace_ascoRun.Rmd#L100

adamhsparks commented 2 years ago

Yes, but they are two very different things so I’m not sure why you’re asking about summarise_trace() here. It’s not related to the issue (or this repository).

tidy_trace() doesn’t do any summarising. It just unlists into a tidy data.frame.

PaulMelloy commented 2 years ago

Hello, I have been looking at this this morning, and there are a couple of things that I have tried changing in the ascotraceR model to fit the "Proportion of infected Quadrats" simulation to the observed data.

I looked into the number of spread events (Billa Billa only), as between September and November there seem to be few spread events that impact the model. September contained almost no rain and no spread events, and rainfall in October seemed not to cause many additional quadrats to become infected.

IMO four things could be happening here

  1. Our distance parameters in splash_distance and wind_distance are too low and need increasing.
  2. The spores_per_gp_per_wet_hour parameter is too low leading to lower spread and infection as mentioned above.
  3. A rainfall multiplier value is needed.
  4. The average hourly wind distance recorded at the weather station was lower than what the model was originally parameterised with by Art and needs to be adjusted for other regions

Spread distance

First I tried changing the distance parameters in splash_distance.R and wind_distance.R in the ascotraceR package.

Wind_distance

rainfall_multiplier branch

https://github.com/IhsanKhaliq/ascotraceR/blob/b98a5ed58b306e9911b9fd04791d936ccdf2e57a/R/wind_distance.R#L21-L24

dev

https://github.com/IhsanKhaliq/ascotraceR/blob/118a8079a92de9f284103ac10206148c3c298aa6/R/wind_distance.R#L19-L22

splash_distance

rainfall_multiplier branch

https://github.com/IhsanKhaliq/ascotraceR/blob/b98a5ed58b306e9911b9fd04791d936ccdf2e57a/R/splash_distance.R#L15-L17

dev

https://github.com/IhsanKhaliq/ascotraceR/blob/118a8079a92de9f284103ac10206148c3c298aa6/R/splash_distance.R#L18-L20

Rainfall multiplier

Next (in addition to the above) I added a rainfall multiplier based on the sum of all rainfall in a day.

rainfall_multiplier branch

https://github.com/IhsanKhaliq/ascotraceR/blob/b98a5ed58b306e9911b9fd04791d936ccdf2e57a/R/one_day.R#L84-L85

IhsanKhaliq commented 2 years ago

Any idea why Adam's figure doesn't match the original figure for Billa Billa? We should get the figure below. This is something we haven't been able to figure out since this AM.

PaulMelloy commented 2 years ago

I think from memory that was from the cross-section of the paddock at 10m

adamhsparks commented 2 years ago

> I think from memory that was from the cross-section of the paddock at 10m

I've used the same locations in the virtual paddock as the real paddock for these figures that are lower. So if my methods are accurate, we need to see if we can provide more realistic parameter values that get it more in line with what was observed.

IhsanKhaliq commented 2 years ago

Yes, that's right, but Adam wasn't getting the same figure with the cross-section either (see the first comment at the top). We should now see more disease since we are sampling more quadrats.

PaulMelloy commented 2 years ago

Yep. That one above was also with the changed wind_distance, splash_distance and rainfall multiplier, as described above. So it would be different to yours @adamhsparks

IhsanKhaliq commented 2 years ago

Just for your information: parameter values tested and used by Coventry (screenshot).

PaulMelloy commented 2 years ago

And which values were the optimum from the range they tested?

PaulMelloy commented 2 years ago

Also just a thought, the asco strain we have here might have different values due to differences in fecundity and aggressiveness

IhsanKhaliq commented 2 years ago

> And which values were the optimum from the range they tested?

See the column Model Parameter Value

IhsanKhaliq commented 2 years ago

> Also just a thought, the asco strain we have here might have different values due to differences in fecundity and aggressiveness

Of course.

IhsanKhaliq commented 2 years ago

> Also just a thought, the asco strain we have here might have different values due to differences in fecundity and aggressiveness

Of course.

I remember Josh saying that their isolates are more aggressive than ours and wouldn't give me infested stubble

PaulMelloy commented 2 years ago

Ok so the Model parameter value was the median value tested and the optimum?

So when they tested a range of values, based off the Model Parameter Value they always found the Model Parameter Value to be the best?

PaulMelloy commented 2 years ago

Also, I wonder if the latent period should be lower. I think I remember Kevin Moore saying the latent period could be as short as 4 days? That would be a latent period of approx. 80.

IhsanKhaliq commented 2 years ago

> Ok so the Model parameter value was the median value tested and the optimum?
>
> So when they tested a range of values, based off the Model Parameter Value they always found the Model Parameter Value to be the best?

They got the values from Art's paper and used whichever was closest to the observations (not the median), and the Model Parameter Value was the closest. We can lower the latent period to about 100, maybe.

adamhsparks commented 2 years ago

> Yep. That one above was also with the changed wind_distance, splash_distance and rainfall multiplier, as described above. So it would be different to yours @adamhsparks

What changed and where? Because now the model fit is worse. Default values? Values in this document? Something in the internal code workings?

PaulMelloy commented 2 years ago

> > Yep. That one above was also with the changed wind_distance, splash_distance and rainfall multiplier, as described above. So it would be different to yours @adamhsparks
>
> What changed and where? because now the model fit is worse. Default values? Values in this document? Something in the internal code workings?

I have not pushed any changes.

I made the changes to the ascotraceR package locally and have been testing the model runs. However, they have been taking a couple of hours to run. I should reduce the number of reps.

Are you asking what changed compared to Art's paper or the Coventry paper?

adamhsparks commented 2 years ago

> > > Yep. That one above was also with the changed wind_distance, splash_distance and rainfall multiplier, as described above. So it would be different to yours @adamhsparks
> >
> > What changed and where? because now the model fit is worse. Default values? Values in this document? Something in the internal code workings?
>
> I have not pushed any changes.
>
> The changes I made to the ascotraceR package locally and have been testing the model runs. However they have been taking a couple of hours to run. I should reduce the number of reps.
>
> Are you asking what changed compared to Art's paper or the Coventry paper?

I'm asking what changes you are referring to. I'm not familiar enough yet. Is it a different code version or papers or something else? Art's paper presumably doesn't apply since it's for lupin not chickpea, correct?

How many reps are you running? 40 takes only a few minutes in this example. See the timings in the HTML output doc. Otherwise, I've a meeting in an hour to set up RStudio on Pawsey for my use. I can chuck it on there. 😀

PaulMelloy commented 2 years ago

oooo pawsey, nice :)

Perhaps we can define some parameter ranges and try some permutations on this model to see what fits best. I have been configuring a Linux server to remote into, but nothing that fast. I still have access to Nectar, and I'm hoping to get access to the UQ HPC soon.

I have only been running 10 - 20 reps. It has been taking hours to run.

IhsanKhaliq commented 2 years ago

As a last suggestion, Art added one infectious_gp growing point in a quadrat when initial infection was observed. Currently, we are just providing the initial infection date https://github.com/IhsanKhaliq/ascotraceR/blob/118a8079a92de9f284103ac10206148c3c298aa6/R/trace_asco.R#L96 Two infectious_gp need to be added to Billa Billa and three to Tosari, because infection was observed in one central quadrat at Billa Billa and three central quadrats at Tosari. We can introduce a new parameter, initial_infection_intensity, to do so. This way we will get rid of the warning too https://github.com/IhsanKhaliq/ascotraceR/issues/117

IhsanKhaliq commented 2 years ago

> oooo pawsey, nice :)
>
> Perhaps we can define some parameter ranges and try some permutations on this model. See what fits best. I have been configuring a linux server to remote into, but nothing that fast. I still have access to nectar, hoping to get access to the UQ HPC soon
>
> I have only been running 10 - 20 reps. It has been taking hours to run.

This is because of the rainfall multiplier. I was running the model for three days on my computer for Billa Billa. I think the rainfall multiplier should be a set value (as we have for the wind Cauchy multiplier) rather than what we currently have? https://github.com/IhsanKhaliq/ascotraceR/blob/b98a5ed58b306e9911b9fd04791d936ccdf2e57a/R/one_day.R#L84-L85

PaulMelloy commented 2 years ago

I was just looking at the weather data from Tosari, and no rain fell after 15 August. The model runs to early November. So I doubt there is any way to have the model correlate with the observations. Was the weather data from a weather station on the farm?

IhsanKhaliq commented 2 years ago

I used data from the nearby weather station https://ozforecast.com.au/cgi-bin/weatherstation.cgi?station=11497. Our weather station did not record rainfall data and the weather station on site recorded daily summaries instead of hourly https://www.goannatelemetry.com.au/Login.aspx

IhsanKhaliq commented 2 years ago

> I was just looking at the weather data from Tosari and no rain fell after the 15 of August. The model runs to early November. So I doubt there is any way to have the model correlate with the observations. Was the weather data from a weather station on the farm?

Afternoon All

I went up to Tosari and installed a new weather station at the end of the bore road near the north west corner of the sorghum.

To log in you need to download the Goanna Telemetry App from the App store (picture below). This can be sent to the researchers if need be.

Login - QLD Crop Research Password: Tosari

IhsanKhaliq commented 2 years ago

If you want the most accurate rainfall data (133 mm) from the above-mentioned weather station, I have provided it as an Excel file, field_weather_data, on the master branch of the ascotraceR repo. But these are daily summaries rather than hourly or 10-minute data.

PaulMelloy commented 2 years ago

It would be worth checking whether the daily rainfall data deviates from the sub-hourly data from OzForecast.
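One way to run that check is to sum the sub-hourly rainfall into daily totals and difference them against the daily records. The Python sketch below is only an outline of that comparison: the function names are hypothetical and the readings are invented placeholders, not values from either weather station.

```python
from collections import defaultdict
from datetime import date, datetime


def daily_totals(records):
    """Sum (timestamp, rain_mm) records into {date: total_mm}."""
    totals = defaultdict(float)
    for stamp, rain_mm in records:
        totals[stamp.date()] += rain_mm
    return dict(totals)


def rainfall_differences(sub_hourly, daily):
    """Return {date: aggregated_mm - daily_mm} for dates present in both sources."""
    aggregated = daily_totals(sub_hourly)
    shared_days = sorted(aggregated.keys() & daily.keys())
    return {day: round(aggregated[day] - daily[day], 2) for day in shared_days}


# Invented sub-hourly readings (10-minute steps) and daily summaries.
sub_hourly = [
    (datetime(2020, 8, 15, 9, 0), 1.2),
    (datetime(2020, 8, 15, 9, 10), 0.8),
    (datetime(2020, 8, 16, 9, 0), 0.0),
]
daily = {date(2020, 8, 15): 3.0, date(2020, 8, 16): 0.0}

differences = rainfall_differences(sub_hourly, daily)
```

Days with a large non-zero difference are the ones where the two sources disagree and are worth inspecting by hand.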

IhsanKhaliq commented 2 years ago

Currently, there is 102 mm of rainfall for Tosari. The onsite weather station recorded 132 mm of rain, but those are daily averages. I provided the daily averages, but then it was decided in a meeting that we need hourly or 10-minute data, so that was the only option I had.

PaulMelloy commented 2 years ago

Yes, we need hourly data to run the model. However, we are currently looking for reasons why the model run summaries are not matching the observed data.

PaulMelloy commented 2 years ago

Thanks. The login to the GoannaAg site provided me with weather data for all of 2020 at 10-minute intervals. Just to confirm, did you say this Goanna weather station was 133 m from the experiment site?

I am going to update the example file to run with this data

IhsanKhaliq commented 2 years ago

Yes, that would be the most accurate. See the email exchange below about the 10-minute data.

Hi Ihsan

You can only do 10 min intervals daily not monthly.

Hamish

On 29 Jun 2020, at 9:03 pm, Ihsan Khaliq wrote:

Hi Hamish, Thanks, I was trying to work it out on my phone. I can download the data now. Just curious, is it possible to download the monthly weather data that has been recorded at 10-minute intervals (e.g., attached ‘example’ file)? For monthly data, I can only download a summary (‘monthly report’ file) for each day (not at 10-minute intervals). I can download daily weather data fine, i.e., it has been recorded at 10-minute intervals (‘daily report’). We require weather data recorded at 10-minute intervals, or hourly, for the model, not daily summaries. I watched that instructional video, and I guess it might not be possible?

Many thanks, Ihsan

PaulMelloy commented 2 years ago

On the GoannaAg weather station site:

IhsanKhaliq commented 2 years ago

I just checked. This feature was not available before. That's awesome.