geco-bern / ingestr

Data ingest for points (given longitude, latitude, and required dates) from large global files or remote data servers and create time series at user-specified temporal resolution.
https://geco-bern.github.io/ingestr

improper use of data.table post-processing #57

Closed khufkens closed 1 year ago

khufkens commented 1 year ago

Improper use of data.table constructs mixed with dplyr ones when reading data and screening for NA values.

https://github.com/geco-bern/ingestr/blob/82ed6e96b662d244a5453b8b82cbcbc55905a4a3/R/get_obs_bysite_fluxnet.R#L1227

In short, the na.strings parameter should be used to screen for NA values when the data is read, rather than post-hoc processing with an ill-suited dplyr routine. Mixing processing workflows (data.table has its own data flow) is probably poor form. If speed is a concern, readr should be used so the code stays within the dplyr workflow, and there too the na parameter should be used instead of post-hoc processing.
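A minimal sketch of the suggested approach, not the actual fix applied in the repository. It assumes -9999 is the missing-value sentinel in the input files; the column names and inline CSV text are made up for illustration.

```r
library(data.table)

# Hypothetical two-column extract; -9999 is assumed to be the
# missing-value sentinel. Column names are illustrative only.
csv_text <- paste(
  "TIMESTAMP,GPP_NT_VUT_REF",
  "20150101,1.25",
  "20150102,-9999",
  "20150103,2.5",
  sep = "\n"
)

# Declare the sentinel once, at read time, so no post-hoc
# dplyr pass over the columns is needed afterwards.
dt <- fread(text = csv_text, na.strings = c("NA", "-9999"))

# readr equivalent for a dplyr-only workflow (not run here):
# df <- readr::read_csv(I(csv_text), na = c("", "NA", "-9999"))

print(dt)
```

Either reader keeps the NA handling in one place and avoids mixing data.table and dplyr steps for the same task. Note that both arguments match whole fields as strings, so a file that writes the sentinel as `-9999.0` would need that string listed as well.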