Closed fredericpiesschaert closed 5 years ago
@fredericpiesschaert, I assume by ID you mean the key field from the database, not just the sequence ID from the input vector?
Suppose you have the following dataframe df taken from the database:
df <- data.frame(ID = letters[1:10], pressure = c(1033:1041, 900))
print(df, row.names = FALSE)
#> ID pressure
#> a 1033
#> b 1034
#> c 1035
#> d 1036
#> e 1037
#> f 1038
#> g 1039
#> h 1040
#> i 1041
#> j 900
Then you can add the outlier column like this:
df$outlier <- gwloggeR::detect_outliers(df$pressure)
print(df, row.names = FALSE)
#> ID pressure outlier
#> a 1033 FALSE
#> b 1034 FALSE
#> c 1035 FALSE
#> d 1036 FALSE
#> e 1037 FALSE
#> f 1038 FALSE
#> g 1039 FALSE
#> h 1040 FALSE
#> i 1041 FALSE
#> j 900 TRUE
Now you have the ID and the outlier field in one dataframe, which can be used for updating the DB. Does this solve the problem?
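As a rough sketch of that last step, the IDs of the flagged records can be pulled out of the dataframe before writing back to the DB. The outlier flags below are copied by hand from the printed output above so the snippet runs without gwloggeR, and the name outlier_ids is only illustrative:

```r
# Same data as above; stringsAsFactors = FALSE keeps ID as character in older R versions
df <- data.frame(ID = letters[1:10], pressure = c(1033:1041, 900),
                 stringsAsFactors = FALSE)

# Flags copied from the printed output above (stand-in for gwloggeR::detect_outliers)
df$outlier <- c(rep(FALSE, 9), TRUE)

# Keys of the records flagged as outliers
outlier_ids <- df$ID[df$outlier]
print(outlier_ids)
#> [1] "j"
```

How these keys are then written back depends on the DB layer being used, which is not shown here.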
I could add an extra argument key to detect_outliers(), but this doesn't seem essential to the function, so I would rather not.
What do you think?
@Jo-Loos could this do the trick?
works like a charm
@DavorJ IDs are currently not included in the output vector; you just get a TRUE/FALSE sequence following the order of the input file:
[1] FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE
This is OK when you work with input files, but when generating the input directly from the DB, as Jo is implementing it, this becomes fishy, because records can be deleted or added in the meantime. Hence, adding the ID to the output is quite essential.
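To illustrate the concern, here is a minimal sketch of matching the flags back by ID rather than by position. It assumes the flags were computed on a snapshot that still carries its IDs; the names snapshot, current and updated, and the record k, are made up for the example:

```r
# Snapshot taken from the DB at detection time, with its flags attached by ID
snapshot <- data.frame(ID = letters[1:10], pressure = c(1033:1041, 900),
                       stringsAsFactors = FALSE)
snapshot$outlier <- c(rep(FALSE, 9), TRUE)  # stand-in for the detect_outliers() result

# Meanwhile the DB changed: record "c" was deleted and a hypothetical record "k" was added
current <- snapshot[snapshot$ID != "c", c("ID", "pressure")]
current <- rbind(current,
                 data.frame(ID = "k", pressure = 1036, stringsAsFactors = FALSE))

# Join on the key, not on row order; records without a flag get NA
updated <- merge(current, snapshot[, c("ID", "outlier")], by = "ID", all.x = TRUE)
```

With a positional match, the flag for j would end up on the wrong row once c is removed. With the join, j keeps its flag and the new record k is left as NA until it is checked.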