Based on all available data sets I identified that approximately 6.4% (i.e. 45566 data points) of all GW quality data points need to be fixed. See below for details:

``` r
library(wasserportal)

gwq_data <- jsonlite::fromJSON("https://kwb-r.github.io/wasserportal/stations_gwq_data.json")

### Number of GW quality data points available in Wasserportal
nrow(gwq_data)
#> [1] 713955

# "-0.00" is parsed as 0, so zero values in µg/l mark the affected samples
gwq_data_tobefixed <- gwq_data %>%
  dplyr::filter(.data$Messwert == 0,
                .data$Einheit == "\u00B5g/l") %>%
  dplyr::count(.data$Messstellennummer) %>%
  dplyr::arrange(dplyr::desc(.data$n)) %>%
  dplyr::rename("n_samples_with_-0.00" = "n")

### Number of GW quality stations that need to be fixed
nrow(gwq_data_tobefixed)
#> [1] 185

### Number of data points that need to be fixed
n_tobefixed <- sum(gwq_data_tobefixed$`n_samples_with_-0.00`)

### Percent of data points that need to be fixed
100 * n_tobefixed / nrow(gwq_data)
#> [1] 6.382195

knitr::kable(gwq_data_tobefixed,
             caption = paste("GW quality monitoring stations_id and number of",
                             "data points that need to be fixed (i.e. increase number of digits",
                             "in case of '-0.00')"))
```

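| Messstellennummer | n_samples_with_-0.00 |
|:------------------|---------------------:|
| 5042 | 1223 |
| 5032 | 1204 |
| 4612 | 1202 |
| 7044 | 1202 |
| 7045 | 1202 |
| 7207 | 1202 |
| 4611 | 1201 |
| 7206 | 1200 |
| 7209 | 1174 |
| 5138 | 1137 |
| 7292 | 1011 |
| 6515 | 1007 |
| 7108 | 1005 |
| 6016 | 979 |
| 15147 | 823 |
| 15001 | 822 |
| 149 | 779 |
| 5366 | 737 |
| 6067 | 342 |
| 4304 | 292 |
| 3354 | 276 |
| 6058 | 274 |
| 6056 | 273 |
| 6023 | 272 |
| 7098 | 272 |
| 7171 | 272 |
| 6014 | 271 |
| 6080 | 270 |
| 7180 | 270 |
| 7295 | 269 |
| 6535 | 267 |
| 7039 | 267 |
| 7229 | 267 |
| 5003 | 266 |
| 5404 | 266 |
| 7027 | 266 |
| 5150 | 265 |
| 6510 | 265 |
| 6548 | 265 |
| 7168 | 265 |
| 5008 | 264 |
| 5049 | 264 |
| 5066 | 264 |
| 7014 | 264 |
| 7172 | 264 |
| 15101 | 264 |
| 5022 | 263 |
| 5039 | 263 |
| 6069 | 263 |
| 5026 | 260 |
| 6010 | 260 |
| 5140 | 258 |
| 6020 | 258 |
| 7215 | 257 |
| 7030 | 256 |
| 5095 | 253 |
| 7109 | 253 |
| 6121 | 252 |
| 5013 | 251 |
| 7173 | 251 |
| 5090 | 250 |
| 6017 | 250 |
| 5074 | 249 |
| 5002 | 247 |
| 7111 | 247 |
| 6038 | 239 |
| 7062 | 239 |
| 15000 | 237 |
| 15156 | 235 |
| 6963 | 234 |
| 7064 | 234 |
| 15049 | 234 |
| 9401 | 233 |
| 5297 | 231 |
| 5058 | 230 |
| 7219 | 230 |
| 7258 | 230 |
| 7291 | 230 |
| 15150 | 230 |
| 1359 | 229 |
| 15065 | 229 |
| 15152 | 229 |
| 4875 | 228 |
| 8964 | 228 |
| 7063 | 227 |
| 7195 | 226 |
| 4233 | 225 |
| 6518 | 225 |
| 7255 | 224 |
| 7257 | 218 |
| 4105 | 217 |
| 7285 | 217 |
| 7290 | 217 |
| 15153 | 217 |
| 4061 | 216 |
| 4521 | 216 |
| 7042 | 216 |
| 7286 | 216 |
| 6533 | 214 |
| 7250 | 210 |
| 4846 | 209 |
| 7136 | 205 |
| 4727 | 204 |
| 6504 | 204 |
| 7259 | 204 |
| 7268 | 200 |
| 5025 | 195 |
| 7137 | 192 |
| 6522 | 191 |
| 6066 | 188 |
| 6534 | 170 |
| 9092 | 168 |
| 5306 | 160 |
| 5130 | 156 |
| 8949 | 155 |
| 5010 | 151 |
| 6084 | 148 |
| 7264 | 148 |
| 17309 | 132 |
| 8957 | 128 |
| 344 | 122 |
| 17306 | 122 |
| 645 | 119 |
| 17303 | 106 |
| 580 | 101 |
| 8950 | 100 |
| 7015 | 90 |
| 5207 | 89 |
| 6028 | 89 |
| 7079 | 89 |
| 7165 | 88 |
| 23750 | 88 |
| 7019 | 84 |
| 7161 | 84 |
| 5097 | 83 |
| 10421 | 74 |
| 17304 | 74 |
| 5009 | 43 |
| 6089 | 28 |
| 6097 | 28 |
| 499 | 26 |
| 5020 | 26 |
| 5076 | 26 |
| 6063 | 26 |
| 5155 | 25 |
| 6026 | 24 |
| 6047 | 24 |
| 6081 | 24 |
| 6113 | 24 |
| 7018 | 24 |
| 7057 | 24 |
| 7078 | 24 |
| 7081 | 24 |
| 7084 | 24 |
| 7132 | 24 |
| 7144 | 24 |
| 7176 | 24 |
| 7210 | 24 |
| 7237 | 24 |
| 7248 | 24 |
| 7298 | 24 |
| 7301 | 24 |
| 282 | 23 |
| 5044 | 20 |
| 5255 | 20 |
| 5005 | 18 |
| 5027 | 18 |
| 5036 | 18 |
| 5040 | 18 |
| 5060 | 18 |
| 5073 | 18 |
| 5075 | 18 |
| 5078 | 18 |
| 5086 | 18 |
| 5351 | 18 |
| 5355 | 18 |
| 6520 | 18 |
| 6511 | 16 |
| 8472 | 14 |
| 3215 | 13 |
| 23703 | 9 |
| 7072 | 8 |
| 7086 | 8 |
| 8469 | 8 |
| 15109 | 8 |
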
GW quality monitoring stations_id and number of data points that need to be fixed (i.e. increase number of digits in case of '-0.00')

See e.g. ID=149: data below the detection limit are indicated with a `-` sign. If the detection limit is smaller in magnitude than `-0.01`, Wasserportal exports it as `-0.00`, so the actual limit is lost to rounding.

149_wasserqualitaet_all_11_05_2005.csv
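
To make the problem concrete, here is a minimal base R sketch of how the sign convention could be decoded and where `-0.00` breaks it. The raw strings are made up for illustration (they are not taken from the export for ID=149), and the variable names `below_dl`, `detection_limit` and `limit_lost` are hypothetical, not part of the wasserportal package.

``` r
# Hypothetical raw strings, as they might appear in a Wasserportal CSV export
# (illustrative values only, not taken from the real data set)
raw <- c("0.25", "-0.05", "-0.00", "1.30")

value <- as.numeric(raw)

# A leading minus sign marks "below detection limit"
below_dl <- startsWith(raw, "-")

# For flagged values the magnitude is the detection limit itself
detection_limit <- ifelse(below_dl, abs(value), NA_real_)

# "-0.00" parses to zero: the sample is flagged, but the limit is unknown
limit_lost <- below_dl & value == 0

data.frame(raw, value, below_dl, detection_limit, limit_lost)
```

Once the values are numeric, `-0.00` can no longer be found with a `< 0` check, which is presumably why the reprex above filters on `Messwert == 0` instead.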
Created on 2022-06-09 by the reprex package (v2.0.0)
Session info
``` r
sessioninfo::session_info()
#> - Session info ---------------------------------------------------------------
#>  setting  value
#>  version  R version 4.1.2 (2021-11-01)
#>  os       Windows 10 x64 (build 19044)
#>  system   x86_64, mingw32
#>  ui       RTerm
#>  language (EN)
#>  collate  German_Germany.1252
#>  ctype    German_Germany.1252
#>  tz       Europe/Berlin
#>  date     2022-06-09
#>  pandoc   2.14.0.3 @ C:/Program Files/RStudio/bin/pandoc/ (via rmarkdown)
#>
#> - Packages -------------------------------------------------------------------
#>  package      * version    date (UTC) lib source
#>  assertthat     0.2.1      2019-03-21 [1] CRAN (R 4.1.0)
#>  backports      1.4.1      2021-12-13 [1] CRAN (R 4.1.2)
#>  cli            3.3.0      2022-04-25 [1] CRAN (R 4.1.3)
#>  crayon         1.5.1      2022-03-26 [1] CRAN (R 4.1.3)
#>  curl           4.3.2      2021-06-23 [1] CRAN (R 4.1.3)
#>  data.table     1.14.2     2021-09-27 [1] CRAN (R 4.1.3)
#>  DBI            1.1.2      2021-12-20 [1] CRAN (R 4.1.3)
#>  digest         0.6.27     2020-10-24 [1] CRAN (R 4.1.0)
#>  dplyr          1.0.9      2022-04-28 [1] CRAN (R 4.1.3)
#>  ellipsis       0.3.2      2021-04-29 [1] CRAN (R 4.1.0)
#>  evaluate       0.15       2022-02-18 [1] CRAN (R 4.1.3)
#>  fansi          1.0.3      2022-03-24 [1] CRAN (R 4.1.3)
#>  fastmap        1.1.0      2021-01-25 [1] CRAN (R 4.1.0)
#>  fs             1.5.0      2020-07-31 [1] CRAN (R 4.1.0)
#>  generics       0.1.2      2022-01-31 [1] CRAN (R 4.1.3)
#>  glue           1.6.2      2022-02-24 [1] CRAN (R 4.1.3)
#>  highr          0.9        2021-04-16 [1] CRAN (R 4.1.0)
#>  htmltools      0.5.2      2021-08-25 [1] CRAN (R 4.1.2)
#>  httr           1.4.3      2022-05-04 [1] CRAN (R 4.1.3)
#>  jsonlite       1.8.0      2022-02-22 [1] CRAN (R 4.1.3)
#>  knitr          1.39       2022-04-26 [1] CRAN (R 4.1.3)
#>  kwb.datetime   0.5.0      2022-06-01 [1] Github (kwb-r/kwb.datetime@5f2b2c4)
#>  kwb.utils      0.13.0     2022-06-08 [1] Github (kwb-r/kwb.utils@6218b79)
#>  lifecycle      1.0.1      2021-09-24 [1] CRAN (R 4.1.1)
#>  magrittr       2.0.3      2022-03-30 [1] CRAN (R 4.1.3)
#>  pillar         1.7.0      2022-02-01 [1] CRAN (R 4.1.3)
#>  pkgconfig      2.0.3      2019-09-22 [1] CRAN (R 4.1.0)
#>  purrr          0.3.4      2020-04-17 [1] CRAN (R 4.1.0)
#>  R6             2.5.1      2021-08-19 [1] CRAN (R 4.1.1)
#>  reprex         2.0.0      2021-04-02 [1] CRAN (R 4.1.0)
#>  rlang          1.0.2      2022-03-04 [1] CRAN (R 4.1.3)
#>  rmarkdown      2.14       2022-04-25 [1] CRAN (R 4.1.3)
#>  rstudioapi     0.13       2020-11-12 [1] CRAN (R 4.1.0)
#>  rvest          1.0.2      2021-10-16 [1] CRAN (R 4.1.3)
#>  sessioninfo    1.2.2      2021-12-06 [1] CRAN (R 4.1.3)
#>  stringi        1.7.6      2021-11-29 [1] CRAN (R 4.1.2)
#>  stringr        1.4.0      2019-02-10 [1] CRAN (R 4.1.0)
#>  styler         1.4.1      2021-03-30 [1] CRAN (R 4.1.0)
#>  tibble         3.1.7      2022-05-03 [1] CRAN (R 4.1.3)
#>  tidyr          1.2.0      2022-02-01 [1] CRAN (R 4.1.3)
#>  tidyselect     1.1.2      2022-02-21 [1] CRAN (R 4.1.3)
#>  utf8           1.2.2      2021-07-24 [1] CRAN (R 4.1.3)
#>  vctrs          0.4.1      2022-04-13 [1] CRAN (R 4.1.3)
#>  wasserportal * 0.1.0.9000 2022-06-02 [1] local
#>  withr          2.5.0      2022-03-03 [1] CRAN (R 4.1.3)
#>  xfun           0.31       2022-05-10 [1] CRAN (R 4.1.3)
#>  xml2           1.3.3      2021-11-30 [1] CRAN (R 4.1.3)
#>  yaml           2.3.5      2022-02-21 [1] CRAN (R 4.1.2)
#>
#>  [1] C:/Users/mrustl/Documents/R/win-library/4.1
#>  [2] C:/Program Files/R/R-4.1.2/library
#>
#> ------------------------------------------------------------------------------
```