IhsanKhaliq / valascotraceR

Validates ascotraceR model with data from the field experiments

Proportion of infected plants vs proportion of infected quadrats? #5

Closed IhsanKhaliq closed 2 years ago

IhsanKhaliq commented 2 years ago

Is it worth trying the proportion of infected plants instead of the proportion of infected quadrats to see if we get a better prediction? Both Art and Coventry used the former option.

[Three screenshots attached, 2021-11-11]

adamhsparks commented 2 years ago

That's more or less what's been done here. I took the proportion of infected growing points vs all growing points not proportions of quadrats.

IhsanKhaliq commented 2 years ago

Well, we need the proportion of infected quadrats or plants because we don't have the proportion of infected growing points for the observations. But I'm going through the code and I'm not sure we're following the same method for observation and prediction. For prediction, we're computing the proportion of quadrats with infection:

https://github.com/IhsanKhaliq/valascotraceR/blob/f021bb884adc16a5f7f93cb07b45319e1e008c4d/example_trace_ascoRun.Rmd#L183-L188

For observation, we're computing the mean of the V1 column, and the column shows presence/absence of disease:

https://github.com/IhsanKhaliq/valascotraceR/blob/f021bb884adc16a5f7f93cb07b45319e1e008c4d/example_trace_ascoRun.Rmd#L206-L210

Why are we computing a proportion for predictions and a mean for observations? Shouldn't we be comparing proportion vs. proportion, not proportion vs. mean?

[Screenshot attached, 2021-11-15]

adamhsparks commented 2 years ago

I'll have to recheck, but if it worked as I intended, it should not be the proportion of infected quadrats out of those inspected but rather the proportion of infected growing points in the inspected quadrats.

IhsanKhaliq commented 2 years ago

You can do that for predictions but there is no way to compare that with the observation because we don't have infected growing points for observation. To compare with observation/disease curves, it should be either proportion of infected quadrats or infected plants
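The two quantities being discussed can differ a lot on the same data. A minimal base R sketch, assuming a hypothetical per-quadrat table (the data frame `pred`, the column `total_gp`, and all values are made up for illustration; only `infectious_gp` is a name used in this thread):

```r
# Hypothetical model output: one row per inspected quadrat.
pred <- data.frame(
  quadrat       = c("q1", "q2", "q3", "q4"),
  infectious_gp = c(0, 2, 10, 0),    # infected growing points (made up)
  total_gp      = c(40, 40, 40, 40)  # all growing points (made up)
)

# Metric 1: proportion of infected growing points in the inspected quadrats.
p_gp <- sum(pred$infectious_gp) / sum(pred$total_gp)

# Metric 2: proportion of quadrats with at least one infected growing point.
p_quadrat <- mean(pred$infectious_gp > 0)

p_gp       # 0.075
p_quadrat  # 0.5
```

Only the second metric can be matched against field observations that record presence/absence per quadrat.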

adamhsparks commented 2 years ago

OK, I misunderstood what you wanted in an earlier conversation. Previously, I was calculating the proportion of infected quadrats but you indicated that wasn't what was needed here, so I changed it to this.

Please make the appropriate changes as needed.

IhsanKhaliq commented 2 years ago

OK, we just need to replace mean(V1) with sum(V1 == 1) / n().

adamhsparks commented 2 years ago

Isn't that still the average?

IhsanKhaliq commented 2 years ago

> Isn't that still the average?

https://stackoverflow.com/questions/34792323/calculate-percentages-of-a-binary-variable-by-another-variable-in-r
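To make the point behind that link concrete: for a column that holds only 0s and 1s, the mean and sum(V1 == 1) / n() give the same number. A minimal base R sketch with made-up values (the vectors below are hypothetical, and `length()` stands in for dplyr's `n()`, which only works inside `summarise()`):

```r
# Hypothetical presence/absence column: 1 = disease present, 0 = absent.
V1 <- c(1, 0, 0, 1, 1, 0, 1, 1)

prop_by_mean  <- mean(V1)                   # average of the 0/1 values
prop_by_count <- sum(V1 == 1) / length(V1)  # share of rows equal to 1

isTRUE(all.equal(prop_by_mean, prop_by_count))  # TRUE: same quantity

# The two only diverge if the column holds counts rather than 0/1 values.
counts <- c(3, 0, 0, 7, 1, 0, 2, 5)
mean_count <- mean(counts)                       # average count per row
prop_nonzero <- sum(counts > 0) / length(counts) # share of rows with any count
```

So swapping one expression for the other should change nothing as long as V1 really is binary; a visible change in the results would suggest the column contains something else.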

adamhsparks commented 2 years ago

Ooops. I'm doing 50 things at once and misread. :) All good.

adamhsparks commented 2 years ago

I've replaced mean(V1) with sum(V1 == 1) / n(). No major changes in the figures or fit that I can see.

IhsanKhaliq commented 2 years ago

I saw a very big difference earlier (it made the fit worse). I will need to recheck my code tomorrow.


From: Adam H. Sparks @.> Sent: Monday, November 15, 2021 7:47:45 PM To: IhsanKhaliq/valascotraceR @.> Cc: Ihsan Khaliq @.>; Author @.> Subject: Re: [IhsanKhaliq/valascotraceR] Proportion of infected plants vs proportion of infected quadrats? (Issue #5)



This email (including any attached files) is confidential and is

for the intended recipient(s) only. If you received this email by

mistake, please, as a courtesy, tell the sender, then delete this

email.

The views and opinions are the originator's and do not necessarily

reflect those of the University of Southern Queensland. Although

all reasonable precautions were taken to ensure that this email

contained no viruses at the time it was sent we accept no

liability for any losses arising from its receipt.

The University of Southern Queensland is a registered provider

of education with the Australian Government.

(CRICOS Institution Code QLD 00244B / NSW 02225M, TEQSA PRV12081)

adamhsparks commented 2 years ago

Maybe it is worse. It's not better. They look more like what I'd initially had before I selected only the observed quadrats.

adamhsparks commented 2 years ago

OK, I think that this needs to be revised if you want to compare quadrats and not the level of infection in the quadrats.

billa <-
  billa %>%
  filter(quadrat != "other") %>%
  mutate(infected = if_else(infectious_gp > 0, TRUE, FALSE, NA)) %>%
  group_by(i_date) %>%
  summarise(p_inf = sum(infected) / n())

This is comparing the number of growing points infected. Not the per cent of quadrats that are infected.

We're getting there. This is what I get when I try to do 50 things at once. 😅

IhsanKhaliq commented 2 years ago

I see what you mean. We need to create a new column, infected, in which quadrats with infectious_gp > 0 are 1 and quadrats with infectious_gp <= 0 are 0. Then we can use sum(infected == 1) / n() to calculate the proportion of infected quadrats?
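A minimal base R sketch of that idea, assuming a hypothetical table with one row per quadrat per assessment date (the data frame `obs`, the dates, and the counts are made up; `i_date`, `infectious_gp`, `infected`, and `p_inf` follow the names used in this thread):

```r
# Hypothetical data: four quadrats assessed on two dates.
obs <- data.frame(
  i_date = as.Date(rep(c("2020-07-01", "2020-08-01"), each = 4)),
  quadrat = rep(c("q1", "q2", "q3", "q4"), times = 2),
  infectious_gp = c(0, 0, 3, 0,   1, 0, 8, 2)
)

# Binary flag: 1 if the quadrat has any infected growing point, else 0.
obs$infected <- as.integer(obs$infectious_gp > 0)

# Proportion of infected quadrats per date.
p_inf <- aggregate(
  infected ~ i_date,
  data = obs,
  FUN = function(x) sum(x == 1) / length(x)
)
names(p_inf)[2] <- "p_inf"

p_inf$p_inf  # 0.25 0.75
```

If the table also contains one block of rows per simulation run, the same summary would need a run identifier in the grouping before averaging across runs.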

adamhsparks commented 2 years ago

Yes, exactly, I poked at it a little bit last night but was too tired to make my brain work properly.

IhsanKhaliq commented 2 years ago

> Yes, exactly, I poked at it a little bit last night but was too tired to make my brain work properly.

Almost there, Adam.

IhsanKhaliq commented 2 years ago

I've made the change, but the proportion of infected quadrats is the same. I think it's more or less the same thing, unless I'm doing something wrong. This vignette requires some code review because V1 was not showing quadrats with presence/absence of disease as I thought. There might be some other discrepancies.

[Screenshot attached, 2021-11-16]

adamhsparks commented 2 years ago

Yes, I've been struggling with exactly this too, @IhsanKhaliq. Glad I'm not alone.

I'll see if I can create a clean script to import and summarise infected paddocks only and go from there and see if that helps me figure this out.

adamhsparks commented 2 years ago

Proportion of infected quadrats is now calculated and shown in the graph for modelled and observed data.

https://github.com/IhsanKhaliq/valascotraceR/commit/ab33d405f73fa9a6c1b2fff4c0acae375b131647

IhsanKhaliq commented 2 years ago

Awesome. Almost there. Not sure why the heat map doesn't show disease spread, though. The data show that 84% and 92% of quadrats are infected in Billa and Tosari, respectively, but the heat map shows maybe 5% quadrat infection. Is it because we've taken the mean instead of the sum?

[Two images attached, 2021-11-16]

IhsanKhaliq commented 2 years ago

We may have to slightly tweak the spore_rate/primary inoculum intensity because the model slightly over-predicted disease at mid-season and slightly over-predicted disease at the end of the season. But we are very close.

[Image attached]

adamhsparks commented 2 years ago

The heat map shows the mean per quadrat, so an average of 1 infected gp out of 20 runs in a quadrat is pretty small compared to the values at the focus. We can convert it to binary infected/not infected, or bin it into categories, if you want to see the quadrat version of the line graph.
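A rough base R sketch of why the mean looks so faint, and of the binary and binned alternatives. Everything here is hypothetical: the matrix `gp`, the four quadrat labels, the counts, and the bin edges are made up for illustration; only the 20 runs come from the comment above.

```r
# Hypothetical infected growing point counts: rows = 20 runs, columns = quadrats.
n_runs <- 20
gp <- matrix(0, nrow = n_runs, ncol = 4,
             dimnames = list(NULL, c("focus", "near", "mid", "far")))
gp[, "focus"] <- 30   # heavily infected in every run
gp[, "near"]  <- 5
gp[1:4, "mid"] <- 1   # infected in 4 of 20 runs
gp[1, "far"]   <- 1   # infected in 1 of 20 runs

# What a mean-based heat map would colour by: tiny away from the focus.
mean_gp <- colMeans(gp)               # 30, 5, 0.2, 0.05

# Binary version: was the quadrat infected in at least one run?
ever_infected <- colSums(gp > 0) > 0  # TRUE for all four quadrats

# Binned version of the mean (bin edges are arbitrary here).
mean_bin <- cut(mean_gp, breaks = c(-Inf, 0, 1, 10, Inf),
                labels = c("none", "<=1", "1-10", ">10"))
```

On a continuous colour scale set by the focus, 0.05 and 0.2 are nearly indistinguishable from zero, even though every quadrat counts as infected in the binary view.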

-- Dr Adam Sparks | Senior Research Scientist Farming Systems Innovation Primary Industries Development Department of Primary Industries and Regional Development 3 Baron-Hay Court, South Perth WA 6151 t +61 (0)8 9368 3689 | m +61 (0)415 489 422 | w dpird.wa.gov.au


IhsanKhaliq commented 2 years ago

I think the line version will be good because it is supposed to show spatial spread. Is the number of lesions per quadrat already provided by summarise_trace?

adamhsparks commented 2 years ago

Using lapply() with summarise_trace() would give you total paddock daily values for the major growing point values that you’d then need to average for the 20 runs. It’s not been done here though.
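The averaging step could look roughly like the base R sketch below. It does not call summarise_trace() itself: `run_summaries` is a hypothetical stand-in for the list that lapply() would return, and the columns and values are made up, so treat the names as assumptions.

```r
# Hypothetical stand-in for lapply(runs, summarise_trace):
# one data frame of daily paddock totals per simulation run.
run_summaries <- list(
  data.frame(i_day = 1:3, infectious_gp = c(0, 2, 6)),
  data.frame(i_day = 1:3, infectious_gp = c(0, 4, 10)),
  data.frame(i_day = 1:3, infectious_gp = c(0, 0, 2))
)

# Stack the runs, then average each day across runs.
all_runs <- do.call(rbind, run_summaries)
daily_mean <- aggregate(infectious_gp ~ i_day, data = all_runs, FUN = mean)

daily_mean$infectious_gp  # 0 2 6
```

The result has one row per i_day, which is the table that would be exported or plotted as the averaged disease curve.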


IhsanKhaliq commented 2 years ago

For the resource announcement paper, we can get away with some sort of table (like hagis) if we want.

But for the model paper, we'd ideally need heatmap comparing obs vs. prediction.

[Screenshot: Art heatmap]

[Screenshot: Coventry heatmap]

IhsanKhaliq commented 2 years ago

Or the line graph is also fine if we don't use a heatmap for the model paper. Can the summarise_trace output be exported as a table for the package paper?

adamhsparks commented 2 years ago

Yes but you’d have 120 lines or however many rows there are for i_day


IhsanKhaliq commented 2 years ago

How did you present the output in your previously published paper, specifically the resource announcement paper? Can we show the first 10 days? Or something along those lines?

images_medium_mpmi-07-19-0180-a_t1

adamhsparks commented 2 years ago

I guess you could. That table is the whole table in that announcement though.

I’d be showing the fit and agreement of the model and the observations though, the last figure. You want people to be convinced that it’s good enough to use it.


IhsanKhaliq commented 2 years ago

Let me search some relevant papers tomorrow then we can decide on this


From: Adam H. Sparks @.> Sent: Tuesday, November 16, 2021 8:47:20 PM To: IhsanKhaliq/valascotraceR @.> Cc: Ihsan Khaliq @.>; Mention @.> Subject: Re: [IhsanKhaliq/valascotraceR] Proportion of infected plants vs proportion of infected quadrats? (Issue #5)




IhsanKhaliq commented 2 years ago

Closed ab33d405f73fa9a6c1b2fff4c0acae375b131647