irods / irods_client_library_rirods

rirods R Package
https://rirods.irods4r.org

Create an own command for metadata listing #12

Open chStaiger opened 1 year ago

chStaiger commented 1 year ago

If a data object or collection carries more than a few metadata items, the ils() output becomes too cluttered to read.

> ils(path="/bobZone/home/christine/test/foo", metadata = TRUE)
                      logical_path
1 /bobZone/home/christine/test/foo
                                                                                       metadata
1 foo, key1, key2, key3, key4, key5, bar, value1, value2, value3, value4, value5, baz, , , , , 
         type
1 data_object
MartinSchobben commented 1 year ago

One possibility would be to make an rirods S3 class and extend the generic print() for it, so the output looks prettier. Maybe something for later on.

trel commented 1 year ago

this was observed and noted during trirods dec 2022 as well.

a custom/extended print() seems useful and good.

montesmariana commented 1 year ago

I think tidyverse nested tibbles are a good option, or at least a good source of inspiration. This can already be done with the tibble package: the output is quite readable, and users familiar with the tidyverse would know how to manipulate it with tidyr::unnest(), for example.

library(rirods)
library(tibble)
create_irods("http://localhost/irods-rest/0.9.3", "/tempZone/home", overwrite = TRUE)
iauth('rods', 'rods')
files <- ils(metadata = TRUE)
files
#>                     logical_path                          metadata        type
#> 1 /tempZone/home/rods/collection attr1, attr2, val1, val2, unit1,   collection
#> 2    /tempZone/home/rods/foo.rds                     foo, bar, baz data_object
files$metadata
#> [[1]]
#>   attribute value units
#> 1     attr1  val1 unit1
#> 2     attr2  val2      
#> 
#> [[2]]
#>   attribute value units
#> 1       foo   bar   baz
files$metadata <- Map(as_tibble, files$metadata)
as_tibble(files)
#> # A tibble: 2 × 3
#>   logical_path                   metadata         type       
#>   <chr>                          <list>           <chr>      
#> 1 /tempZone/home/rods/collection <tibble [2 × 3]> collection 
#> 2 /tempZone/home/rods/foo.rds    <tibble [1 × 3]> data_object
files$metadata
#> [[1]]
#> # A tibble: 2 × 3
#>   attribute value units  
#>   <chr>     <chr> <chr>  
#> 1 attr1     val1  "unit1"
#> 2 attr2     val2  ""     
#> 
#> [[2]]
#> # A tibble: 1 × 3
#>   attribute value units
#>   <chr>     <chr> <chr>
#> 1 foo       bar   baz
files[1, 'metadata']
#> [[1]]
#> # A tibble: 2 × 3
#>   attribute value units  
#>   <chr>     <chr> <chr>  
#> 1 attr1     val1  "unit1"
#> 2 attr2     val2  ""

Created on 2023-03-17 with reprex v2.0.2
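
For completeness, here is a minimal sketch of the tidyr::unnest() step mentioned above. The `files` object is a hand-built stand-in with the same shape as the ils(metadata = TRUE) result in the reprex, so no iRODS server is needed; the name `flat` is only illustrative.

```r
library(tibble)
library(tidyr)

# Stand-in for the nested `files` object from the reprex (assumed shape).
files <- tibble(
  logical_path = c("/tempZone/home/rods/collection", "/tempZone/home/rods/foo.rds"),
  metadata = list(
    tibble(attribute = c("attr1", "attr2"), value = c("val1", "val2"), units = c("unit1", "")),
    tibble(attribute = "foo", value = "bar", units = "baz")
  ),
  type = c("collection", "data_object")
)

# One row per metadata item; logical_path and type are repeated as needed.
flat <- unnest(files, cols = metadata)
flat
```

This gives a flat table that can be filtered on attribute or value with the usual tools.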

One odd thing is how character values are printed when one of the values in a column is an empty string (e.g. "units" in this example). The data.frame print() method just shows an empty cell, whereas the tibble print method shows it as "" and then also adds quotation marks to the other cells in that column. If the empty string is turned into NA_character_, the quotation marks disappear from "unit1".
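
A rough sketch of that NA_character_ conversion, in case it is useful. The helper name `blank_to_na` is made up for this example and is not part of rirods; the input is a hand-built copy of the first metadata table above.

```r
# Hypothetical helper: replace "" with NA_character_ in all character columns.
blank_to_na <- function(df) {
  for (name in names(df)) {
    col <- df[[name]]
    if (is.character(col)) {
      col[!is.na(col) & col == ""] <- NA_character_
      df[[name]] <- col
    }
  }
  df
}

# Stand-in for files$metadata[[1]] from the reprex.
metadata <- data.frame(
  attribute = c("attr1", "attr2"),
  value = c("val1", "val2"),
  units = c("unit1", ""),
  stringsAsFactors = FALSE
)

cleaned <- blank_to_na(metadata)

# Print as a tibble only if the package happens to be installed.
if (requireNamespace("tibble", quietly = TRUE)) {
  print(tibble::as_tibble(cleaned))
} else {
  print(cleaned)
}
```

Whether a missing unit should really be NA rather than an empty string is a separate design question; this only changes how the column prints.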

If taking on tibble as a dependency is not worth it, it might be worth looking into how tibble renders this output (at least the summary of the nested data frame/tibble) and imitating it. I couldn't immediately find where tibble does that.
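
As a starting point, here is a base-R sketch that combines the S3 print() idea from earlier in this thread with a tibble-like summary of the nested metadata. Everything named here (`irods_df`, `new_irods_df`, `summarise_metadata`, the `<metadata [n x m]>` label) is an assumption for illustration, not existing rirods API, and the data is a hand-built stand-in for the ils(metadata = TRUE) result.

```r
# Hypothetical constructor: tags an ils()-style data frame with an extra class.
new_irods_df <- function(x) {
  stopifnot(is.data.frame(x))
  class(x) <- c("irods_df", class(x))
  x
}

# Summarise one metadata entry as "<metadata [rows x cols]>".
summarise_metadata <- function(m) {
  if (is.data.frame(m)) {
    sprintf("<metadata [%d x %d]>", nrow(m), ncol(m))
  } else {
    "<metadata [0 x 0]>"
  }
}

# print() method: show a compact summary instead of the full metadata.
print.irods_df <- function(x, ...) {
  out <- data.frame(
    logical_path = x$logical_path,
    metadata = vapply(x$metadata, summarise_metadata, character(1)),
    type = x$type,
    stringsAsFactors = FALSE
  )
  print(out, ...)
  invisible(x)
}

# Stand-in data mimicking the shape of ils(metadata = TRUE).
files <- data.frame(
  logical_path = c("/tempZone/home/rods/collection", "/tempZone/home/rods/foo.rds"),
  type = c("collection", "data_object"),
  stringsAsFactors = FALSE
)
files$metadata <- list(
  data.frame(attribute = c("attr1", "attr2"), value = c("val1", "val2"),
             units = c("unit1", ""), stringsAsFactors = FALSE),
  data.frame(attribute = "foo", value = "bar", units = "baz",
             stringsAsFactors = FALSE)
)

irods_files <- new_irods_df(files)
print(irods_files)
```

The full metadata stays available through irods_files$metadata; only the printed view is shortened.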