AquaSat / AquaMatch_harmonize_WQP
MIT License

Change time handling, undo .gitignore of chapters #97

Closed mbrousil closed 4 months ago

mbrousil commented 4 months ago

Hey all,

The goals of this PR are to fix issues with time handling, update the bookdown to cover the new time methods and respond to a request from Jack, and try to fix the issue with images not loading on the bookdown site. @steeleb, if you're able to do a thorough review of the time-related changes that would be super helpful, though I know ASLO is next week, so just lmk when you're able to!

  1. Time handling: I realized that the ActivityStartDateTime column is produced by dataRetrieval and is in fact in UTC rather than local time, so I've updated fill_date_time() in 3_harmonize/src/clean_wqp_data.R accordingly. It should now produce harmonized_local_time, harmonized_tz, and harmonized_utc columns. Note that harmonized_local_time is stored as character, not datetime, because a datetime column in R can only carry a single time zone. About 25% of our harmonized_utc times should be 1 hour off of ActivityStartDateTime, and the vast majority of these are off in the same direction. (See bottom for reprex). Ultimately it was more straightforward to handle DST inconsistencies in the data by letting {lubridate} apply DST based on location + date rather than relying on the time zone abbreviations in the dataset. This is partly because time zone strings like "CST" produce errors while location-based ones like "America/Chicago" don't. I've also added the ActivityStartTime.Time column to the aggregated output in this version. It seemed best to include it with the rest of the date/time info for full usability.
  2. The bookdown has been edited to explain the time changes above and to give a quick explanation of what AquaSat v2 is.
  3. I think the root of the issue with the bookdown site not loading images is that the docs/chapters/ folder was not being tracked on GitHub. The old name of this folder, _book/chapters, was ignored, and I think that carried over. All of those files are now tracked as part of this PR.
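For reviewers, here is a minimal sketch of the idea in item 1. It is not the actual fill_date_time() code: the toy data frame and its input columns (local_chr, tz_name) are hypothetical, and using lubridate's force_tzs() is an assumption about one way to do this. It shows local clock times being interpreted with location-based zone names so that DST follows from location + date, and the local time then being stored as character because the rows span more than one zone.

```r
library(lubridate)

# Hypothetical toy input: local clock times plus location-based zone names
samples <- data.frame(
  local_chr = c("2004-07-13 10:10:00", "2004-01-13 10:10:00", "2004-07-13 10:10:00"),
  tz_name   = c("America/New_York", "America/New_York", "America/Chicago"),
  stringsAsFactors = FALSE
)

# Interpret each clock time in its own zone and return the instants in UTC.
# DST is decided by zone name + date, not by an abbreviation like "EST".
samples$harmonized_utc <- force_tzs(
  ymd_hms(samples$local_chr),
  tzones = samples$tz_name,
  tzone_out = "UTC"
)

# A datetime column carries one time zone, so per-row local times
# are formatted to character instead.
samples$harmonized_local_time <- vapply(
  seq_len(nrow(samples)),
  function(i) {
    format(
      samples$harmonized_utc[i],
      format = "%Y-%m-%d %H:%M:%S",
      tz = samples$tz_name[i],
      usetz = TRUE
    )
  },
  character(1)
)

print(samples)
```

In this toy data the July and January New York rows share the same clock time but land an hour apart in UTC, which is the kind of DST-driven shift described above.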

I have the current output uploaded to Drive if you need it! Let me know if there's any more info, etc. that you all need to review this. Thanks!



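library(dplyr)
library(ggplot2)
library(lubridate)
# Histogram of the offset (in hours) between ActivityStartDateTime and harmonized_utc
p3_chla_agg_harmonized_feather %>%
  mutate(utc_diff = as.numeric(ymd_hms(ActivityStartDateTime) - harmonized_utc)) %>%
  ggplot() +
  geom_histogram(aes(utc_diff / 60^2))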
#> Warning: There was 1 warning in `mutate()`.
#> ℹ In argument: `utc_diff = as.numeric(ymd_hms(ActivityStartDateTime) -
#>   harmonized_utc)`.
#> Caused by warning:
#> !  2420 failed to parse.
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
#> Warning: Removed 404362 rows containing non-finite outside the scale range
#> (`stat_bin()`).

# Rows where ActivityStartDateTime is later than harmonized_utc
p3_chla_agg_harmonized_feather %>%
  mutate(utc_diff = as.numeric(ymd_hms(ActivityStartDateTime) - harmonized_utc)) %>%
  filter(utc_diff > 0) %>%
  select(harmonized_local_time, ActivityStartDateTime, harmonized_utc)
#> Warning: There was 1 warning in `mutate()`.
#> ℹ In argument: `utc_diff = as.numeric(ymd_hms(ActivityStartDateTime) -
#>   harmonized_utc)`.
#> Caused by warning:
#> !  2420 failed to parse.
#> # A tibble: 10 × 3
#>    harmonized_local_time   ActivityStartDateTime harmonized_utc     
#>    <chr>                   <dttm>                <dttm>             
#>  1 2004-07-13 10:10:00 EDT 2004-07-13 15:10:00   2004-07-13 14:10:00
#>  2 2004-08-10 09:50:00 EDT 2004-08-10 14:50:00   2004-08-10 13:50:00
#>  3 2004-09-14 10:10:00 EDT 2004-09-14 15:10:00   2004-09-14 14:10:00
#>  4 2004-07-13 11:20:00 EDT 2004-07-13 16:20:00   2004-07-13 15:20:00
#>  5 2004-08-10 11:00:00 EDT 2004-08-10 16:00:00   2004-08-10 15:00:00
#>  6 2004-08-30 10:30:00 EDT 2004-08-30 15:30:00   2004-08-30 14:30:00
#>  7 2004-09-08 11:45:00 EDT 2004-09-08 16:45:00   2004-09-08 15:45:00
#>  8 2004-09-09 07:25:00 EDT 2004-09-09 12:25:00   2004-09-09 11:25:00
#>  9 2004-09-14 11:25:00 EDT 2004-09-14 16:25:00   2004-09-14 15:25:00
#> 10 2004-09-18 05:20:00 EDT 2004-09-18 10:20:00   2004-09-18 09:20:00

Created on 2024-05-30 with reprex v2.1.0
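As a complement to the reprex, here is a small sketch of tabulating the offsets rather than plotting them. It is not part of the PR: the toy vectors start_dt and harm_utc are hypothetical stand-ins for ActivityStartDateTime and harmonized_utc. It uses base R's difftime() with explicit units, since subtracting two datetimes with `-` picks its units automatically.

```r
library(lubridate)

# Hypothetical toy values standing in for the two columns compared above
start_dt <- ymd_hms(c("2004-07-13 15:10:00", "2004-01-13 15:10:00", "2004-08-10 14:50:00"))
harm_utc <- ymd_hms(c("2004-07-13 14:10:00", "2004-01-13 15:10:00", "2004-08-10 13:50:00"))

# Fix the units to hours so the result does not depend on automatic unit selection
utc_diff_hours <- as.numeric(difftime(start_dt, harm_utc, units = "hours"))

# Count rows per offset and compute the share of rows that differ at all
offset_counts <- table(utc_diff_hours)
share_offset <- mean(utc_diff_hours != 0)

print(offset_counts)
print(share_offset)
```

On the real output, the same two summaries would give the share of records that are an hour off and show whether the offsets mostly go in one direction.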

mbrousil commented 4 months ago

Merging and opening new PR with new changes