Delta-Stewardship-Council / swg-21-connectivity

connectivity synthesis subgroup
2 stars 9 forks source link

modeling code #53

Closed catarfish closed 2 years ago

catarfish commented 2 years ago

So pulling from the clean daymet dataset, we found some issues that Shruti said you had worked on before with the leap year. We need to have data for everyday to use our rollapply function, so can you either directly make edits or let me know how to fix it so that we have data for every day in our covariates? I can also talk to lauren et al if they already have a clean covars dataset.

The rest of these edits are made so that we have something to present for monday. we are still using the original datasets for now until we fix that and get liz's clean chlorophyll back up (i think it got deleted) so if it looks okay ayou could just merge this for now so pascale can access it that would be great and we can work on leap years separately. Thanks!!

ryanpeek commented 2 years ago

@catarfish I will try to get to this today and either merge or fix and merge. I'll ping you either way when it's done!

ryanpeek commented 2 years ago

@catarfish so I looked at f_load_gam_data and I think there are two potential issues. Neither has to do with leap years...

Solutions!

I pushed a potential workaround, but not sure it's exactly what you want. You can either:

  1. do your rollapply on the distinct data pieces FIRST, then join everything together
  2. join all data EXCEPT the chl-a piece, make sure you have a distinct series of dates (i.e., everything should equal 8708), and then join with chl-a data (and have the duplicated dates and associated covariate data).

Hope this makes sense! Take a look and let me know what you think.

catarfish commented 2 years ago

Hi Ryan,

Thanks for taking a look. I'm not sure I see your changes. I still don't fully get the pull request options so that might be why, but can't find them.

For the comment regarding join, I'm not sure that is the problem. There are 8708 rows for the covars (daymet_flow_inun_cimis; just an extra NA row). For now we were just using a covars dataset that includes daymet, flow verona, inundation, and cimis. The rollapply are only applied to this dataset. However when I check the lag on the covars between each day, there are 10 rows that have more than a 1-lag day (row 72 ish). All of these entries occur on a 12/31 and 1/1 around a leap year.

[image: image.png] Sorry the code is a bit confusing because I DID do a join with chlorophyll a just to set up for the future and have something to play with, but was waiting on liz's dataset before finalizing it. So it's kind of a placeholder, and I will join covars after doing the rollapply.

Thanks! Cat

On Wed, Jan 19, 2022 at 3:23 PM Ryan Peek @.***> wrote:

@catarfish https://github.com/catarfish so I looked at f_load_gam_data and I think there are two potential issues. Neither has to do with leap years...

  • when joining all the data together (line 66ish), we are adding duplicate dates because we are using full_join here. Which is fine, because we want to keep all the chl-a data, of which there are multiple measurements from the same date (but different stations) in many cases.
  • But what this means is when you try to rollapply over the dates, you are creating a non-uniform moving window in some cases because it's looking at 7 rows as always = to a week...but that isn't always the case because of the duplicate dates.

Solutions!

I pushed a potential workaround, but not sure it's exactly what you want. You can either:

  1. do your rollapply on the distinct data pieces FIRST, then join everything together
  2. join all data EXCEPT the chl-a piece, make sure you have a distinct series of dates (i.e., everything should equal 8708), and then join with chl-a data (and have the duplicated dates and associated covariate data).

Hope this makes sense! Take a look and let me know what you think.

— Reply to this email directly, view it on GitHub https://github.com/Delta-Stewardship-Council/swg-21-connectivity/pull/53#issuecomment-1016962344, or unsubscribe https://github.com/notifications/unsubscribe-auth/ALCPFKQ5VHI6RS7EZSDKH5DUW5BWXANCNFSM5MKMHWJA . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

You are receiving this because you were mentioned.Message ID: @.*** com>

catarfish commented 2 years ago

Hey Ryan, so I copied and pasted in your changes. Will make a few tweaks afterwards once the chla data are ready. Go ahead and merge in. Thanks for fixing!

ryanpeek commented 2 years ago

looks good @catarfish and I added the chla_dataset in themodel_dataset...it's a little different so will want to double check and replace the data pieces, but it merges all the flow, inundation, and day met data in with the chla data we pulled for yolo.