Closed by japhir 3 years ago
That is a curious bug, I have never seen it. Can you confirm that the following two statements really give different results, with the first returning the correct Analysis value for each file and the second returning the same Analysis value for every file?
iso_get_file_info(clumpedr::standards, Analysis)
# vs.
iso_get_raw_data(clumpedr::standards, include_file_info = Analysis) %>% dplyr::select(file_id, Analysis) %>% unique()
ah whoops, I'm being dumb. Of course it's repeating the file info for each row of raw data… Sorry!
library(isoreader)
#>
#> Attaching package: 'isoreader'
#> The following object is masked from 'package:stats':
#>
#> filter
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
devtools::load_all("~/SurfDrive/PhD/programming/clumpedr")
#> ℹ Loading clumpedr
iso_get_raw_data(clumpedr::standards, include_file_info = Analysis) %>% select(file_id, Analysis)
#> Info: aggregating raw data from 27 data file(s), including file info 'Analysis'
#> # A tibble: 2,187 x 2
#> file_id Analysis
#> <chr> <chr>
#> 1 180814_75_IAM_1_ETH-3.did 4841
#> 2 180814_75_IAM_1_ETH-3.did 4841
#> 3 180814_75_IAM_1_ETH-3.did 4841
#> 4 180814_75_IAM_1_ETH-3.did 4841
#> 5 180814_75_IAM_1_ETH-3.did 4841
#> 6 180814_75_IAM_1_ETH-3.did 4841
#> 7 180814_75_IAM_1_ETH-3.did 4841
#> 8 180814_75_IAM_1_ETH-3.did 4841
#> 9 180814_75_IAM_1_ETH-3.did 4841
#> 10 180814_75_IAM_1_ETH-3.did 4841
#> # … with 2,177 more rows
iso_get_raw_data(clumpedr::standards, include_file_info = Analysis) %>% distinct(file_id, Analysis)
#> Info: aggregating raw data from 27 data file(s), including file info 'Analysis'
#> # A tibble: 27 x 2
#> file_id Analysis
#> <chr> <chr>
#> 1 180814_75_IAM_1_ETH-3.did 4841
#> 2 180814_75_IAM_10_IAEA-C2.did 4850
#> 3 180814_75_IAM_11_IAEA-C1.did 4851
#> 4 180814_75_IAM_2_ETH-3.did 4842
#> 5 180814_75_IAM_3_ETH-1.did 4843
#> 6 180814_75_IAM_4_ETH-1.did 4844
#> 7 180814_75_IAM_5_ETH-2.did 4845
#> 8 180814_75_IAM_6_ETH-2.did 4846
#> 9 180814_75_IAM_7_ETH-4.did 4847
#> 10 180814_75_IAM_8_ETH-4.did 4848
#> # … with 17 more rows
iso_get_file_info(clumpedr::standards) %>% select(file_id, Analysis)
#> Info: aggregating file info from 27 data file(s)
#> # A tibble: 27 x 2
#> file_id Analysis
#> <chr> <chr>
#> 1 180814_75_IAM_1_ETH-3.did 4841
#> 2 180814_75_IAM_10_IAEA-C2.did 4850
#> 3 180814_75_IAM_11_IAEA-C1.did 4851
#> 4 180814_75_IAM_2_ETH-3.did 4842
#> 5 180814_75_IAM_3_ETH-1.did 4843
#> 6 180814_75_IAM_4_ETH-1.did 4844
#> 7 180814_75_IAM_5_ETH-2.did 4845
#> 8 180814_75_IAM_6_ETH-2.did 4846
#> 9 180814_75_IAM_7_ETH-4.did 4847
#> 10 180814_75_IAM_8_ETH-4.did 4848
#> # … with 17 more rows
Created on 2021-03-09 by the reprex package (v1.0.0)
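For anyone else who trips over this, the effect can be reproduced without isoreader. The sketch below is only an analogy in base R: the `file_info` and `raw_data` tables and their values are made up, and the `merge()` call is an assumption standing in for whatever isoreader does internally when file info is included with raw data.

```r
# Toy stand-ins for the file info and raw data tables (made-up values,
# not the actual contents of clumpedr::standards).
file_info <- data.frame(
  file_id = c("a.did", "b.did", "c.did"),
  Analysis = c("4841", "4842", "4843"),
  stringsAsFactors = FALSE
)
raw_data <- data.frame(
  file_id = rep(c("a.did", "b.did", "c.did"), each = 4),
  cycle = rep(1:4, times = 3),
  stringsAsFactors = FALSE
)

# Attaching file info to the raw data repeats each file's Analysis value
# on every raw data row that belongs to that file.
combined <- merge(raw_data, file_info, by = "file_id")

# The first rows all come from the same file, so they show one Analysis value.
first_rows <- head(combined[, c("file_id", "Analysis")], 4)

# Collapsing to one row per file recovers the per-file values.
per_file <- unique(combined[, c("file_id", "Analysis")])

print(first_rows)
print(per_file)
```

The printed head of `combined` looks like a single repeated value, just like the first tibble above, while `per_file` has one row per file with its own Analysis value.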
I've been trying to debug this all day :O.
Looks like this is a problem that could do with a simple test:
whenever I set include_file_info = Analysis, it repeats the first value for all rows:
results in