isoverse / isoreader

Read IRMS (Isotope Ratio Mass Spectrometry) data files into R
http://isoreader.isoverse.org
GNU General Public License v2.0

iso_get_raw_data's include_file_info column repeats first value #160

Closed (japhir closed 3 years ago)

japhir commented 3 years ago

I've been trying to debug this all day :O.

Looks like this is a problem that could do with a simple test:

Whenever I set include_file_info = Analysis, it repeats the first value for all rows:

library(isoreader)
iso_get_raw_data(clumpedr::standards, include_file_info = Analysis)

results in

Info: aggregating raw data from 27 data file(s), including file info 'Analysis'
# A tibble: 2,187 x 11
   file_id Analysis type  cycle v44.mV v45.mV v46.mV v47.mV v48.mV v49.mV v54.mV
   <chr>   <chr>    <chr> <int>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>
 1 180814… 4841     stan…     0 16647. 19675. 23072. 25664.  2124.  -268.  -270.
 2 180814… 4841     stan…     1 16406. 19391. 22738. 25293.  2094.  -264.  -265.
 3 180814… 4841     stan…     2 16153. 19091. 22387. 24899.  2062.  -259.  -261.
 4 180814… 4841     stan…     3 15891. 18782. 22024. 24502.  2029.  -255.  -257.
 5 180814… 4841     stan…     4 15631. 18476. 21665. 24100.  1997.  -250.  -252.
 6 180814… 4841     stan…     5 15383. 18182. 21321. 23717.  1964.  -246.  -249.
 7 180814… 4841     stan…     6 15135. 17889. 20978. 23336.  1934.  -241.  -244.
 8 180814… 4841     stan…     7 14892. 17603. 20641. 22959.  1903.  -237.  -240.
 9 180814… 4841     stan…     8 14656. 17323. 20313. 22600.  1873.  -234.  -236.
10 180814… 4841     stan…     9 14422. 17047. 19990. 22234.  1844.  -229.  -232.
# … with 2,177 more rows

sebkopf commented 3 years ago

That is a curious bug; I have never seen it. Can you confirm that the following two statements really give different results, i.e. that the first returns the correct Analysis value for each file while the second returns the same Analysis value for every file?

iso_get_file_info(clumpedr::standards, Analysis) 
# vs.
iso_get_raw_data(clumpedr::standards, include_file_info = Analysis) %>% dplyr::select(file_id, Analysis) %>% unique()

japhir commented 3 years ago

ah whoops, I'm being dumb. Of course it's repeating the file info for each row of raw data… Sorry!

library(isoreader)
#> 
#> Attaching package: 'isoreader'
#> The following object is masked from 'package:stats':
#> 
#>     filter
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
devtools::load_all("~/SurfDrive/PhD/programming/clumpedr")
#> ℹ Loading clumpedr

iso_get_raw_data(clumpedr::standards, include_file_info = Analysis) %>% select(file_id, Analysis)
#> Info: aggregating raw data from 27 data file(s), including file info 'Analysis'
#> # A tibble: 2,187 x 2
#>    file_id                   Analysis
#>    <chr>                     <chr>   
#>  1 180814_75_IAM_1_ETH-3.did 4841    
#>  2 180814_75_IAM_1_ETH-3.did 4841    
#>  3 180814_75_IAM_1_ETH-3.did 4841    
#>  4 180814_75_IAM_1_ETH-3.did 4841    
#>  5 180814_75_IAM_1_ETH-3.did 4841    
#>  6 180814_75_IAM_1_ETH-3.did 4841    
#>  7 180814_75_IAM_1_ETH-3.did 4841    
#>  8 180814_75_IAM_1_ETH-3.did 4841    
#>  9 180814_75_IAM_1_ETH-3.did 4841    
#> 10 180814_75_IAM_1_ETH-3.did 4841    
#> # … with 2,177 more rows

iso_get_raw_data(clumpedr::standards, include_file_info = Analysis) %>% distinct(file_id, Analysis)
#> Info: aggregating raw data from 27 data file(s), including file info 'Analysis'
#> # A tibble: 27 x 2
#>    file_id                      Analysis
#>    <chr>                        <chr>   
#>  1 180814_75_IAM_1_ETH-3.did    4841    
#>  2 180814_75_IAM_10_IAEA-C2.did 4850    
#>  3 180814_75_IAM_11_IAEA-C1.did 4851    
#>  4 180814_75_IAM_2_ETH-3.did    4842    
#>  5 180814_75_IAM_3_ETH-1.did    4843    
#>  6 180814_75_IAM_4_ETH-1.did    4844    
#>  7 180814_75_IAM_5_ETH-2.did    4845    
#>  8 180814_75_IAM_6_ETH-2.did    4846    
#>  9 180814_75_IAM_7_ETH-4.did    4847    
#> 10 180814_75_IAM_8_ETH-4.did    4848    
#> # … with 17 more rows

iso_get_file_info(clumpedr::standards) %>% select(file_id, Analysis)
#> Info: aggregating file info from 27 data file(s)
#> # A tibble: 27 x 2
#>    file_id                      Analysis
#>    <chr>                        <chr>   
#>  1 180814_75_IAM_1_ETH-3.did    4841    
#>  2 180814_75_IAM_10_IAEA-C2.did 4850    
#>  3 180814_75_IAM_11_IAEA-C1.did 4851    
#>  4 180814_75_IAM_2_ETH-3.did    4842    
#>  5 180814_75_IAM_3_ETH-1.did    4843    
#>  6 180814_75_IAM_4_ETH-1.did    4844    
#>  7 180814_75_IAM_5_ETH-2.did    4845    
#>  8 180814_75_IAM_6_ETH-2.did    4846    
#>  9 180814_75_IAM_7_ETH-4.did    4847    
#> 10 180814_75_IAM_8_ETH-4.did    4848    
#> # … with 17 more rows

Created on 2021-03-09 by the reprex package (v1.0.0)
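
For anyone landing here without access to the clumpedr data, the behaviour can be reproduced with a toy example. This is only a sketch in base R: the data frames `file_info` and `raw_data` and their values are made up for illustration, and the `merge()` call stands in for whatever isoreader does internally when `include_file_info` is set (that it behaves like a join on `file_id` is an assumption based on the output above, not something confirmed from the package source).

```r
# Hypothetical stand-ins: file info has one row per file,
# raw data has several rows (cycles) per file.
file_info <- data.frame(
  file_id  = c("a.did", "b.did"),
  Analysis = c("4841", "4842"),
  stringsAsFactors = FALSE
)
raw_data <- data.frame(
  file_id = rep(c("a.did", "b.did"), each = 3),
  cycle   = rep(0:2, times = 2),
  stringsAsFactors = FALSE
)

# Joining on file_id copies each file's Analysis onto every raw-data row
# that belongs to that file.
combined <- merge(raw_data, file_info, by = "file_id")

# The first rows all belong to the first file, so they all show its value.
head(combined, 3)

# Collapsing to unique file_id/Analysis pairs recovers one row per file.
per_file <- unique(combined[, c("file_id", "Analysis")])
per_file
```

Looking only at the first printed rows of the combined table is what makes it seem as if the first value is repeated everywhere; reducing to unique `file_id`/`Analysis` pairs, as in the `distinct()` call above, is a quick way to check that each file carries its own value.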