TileDB-Inc / TileDB-R

R interface to TileDB: The Modern Database
https://tiledb-inc.github.io/TileDB-R

tiledb_object_walk() column contents has to be swapped #683

Closed cgiachalis closed 8 months ago

cgiachalis commented 8 months ago

In short, in the output data frame the TYPE column holds the URIs and the URI column holds the types. The docs describe a data frame with object type and object URI string columns, so the contents have to be swapped.

library(tiledb)

root_uri <- tempdir()
uri <- paste0(root_uri, "\\temp")
fromDataFrame(iris, uri)
res <- tiledb_object_walk(root_uri)

res$URI
 [1] "ARRAY"

res$TYPE
 [1] "file:///C:/Users/<.....>/Temp/RtmpEPaF1U/temp"

The root cause is in libtiledb_object_walk: https://github.com/TileDB-Inc/TileDB-R/blob/d796865475ae79c8d63bb4355d39c65bdcec9604/src/libtiledb.cpp#L4327-L4328

If we swap uris with types we get the desired result:

 res$URI
[1] "file:///C:/Users/<....>/Temp/Rtmpq0MVLA/temp"

 res$TYPE
[1] "ARRAY"
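
Until a fixed release is installed, the swap can also be applied on the user side. The following is a minimal sketch in base R, not part of the tiledb package: `fix_walk_columns()` and `known_types` are hypothetical names, and the set of type names is an assumption based only on the values shown in this thread.

```r
# Hypothetical helper, not part of the tiledb package.
# Type names seen in this thread; extend if your data has others (assumption).
known_types <- c("ARRAY", "GROUP")

fix_walk_columns <- function(df, types = known_types) {
  stopifnot(is.data.frame(df), all(c("TYPE", "URI") %in% names(df)))
  # Treat the columns as swapped when URI holds only type names and TYPE does not
  swapped <- nrow(df) > 0 && all(df$URI %in% types) && !all(df$TYPE %in% types)
  if (swapped) {
    # Positional assignment: TYPE receives the old URI values and vice versa
    df[c("TYPE", "URI")] <- df[c("URI", "TYPE")]
  }
  df
}

# Mock of the affected output, using two rows reported in this thread
before <- data.frame(
  TYPE = c("file:///tmp/tiledb/soco/pbmc3k_processed",
           "file:///tmp/tiledb/soco/pbmc3k_processed/obs"),
  URI = c("GROUP", "ARRAY"),
  stringsAsFactors = FALSE
)
after <- fix_walk_columns(before)
```

Because the helper only swaps when it detects the misplaced values, it should leave the output of a fixed version untouched, so wrapping `tiledb_object_walk()` with it would be safe across both.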

Let me know if you wish to see a PR.

eddelbuettel commented 8 months ago

Another excellent catch.

Let me check with the documentation to see what C++, Python, ... do so that I can align accordingly.

eddelbuettel commented 8 months ago

It is not entirely clear from looking at https://docs.tiledb.com/main/how-to/object-management, so I think I will just do a quick clean-up along the lines you suggested.

It'll also look nicer once TYPE is short. Here is a quick 'before' with a standard single-cell data set I have here.

```r
> tiledb::tiledb_object_walk("/tmp/tiledb/soco")
   TYPE URI
 1 file:///tmp/tiledb/soco/pbmc3k_processed GROUP
 2 file:///tmp/tiledb/soco/pbmc3k_processed/ms GROUP
 3 file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA GROUP
 4 file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/X GROUP
 5 file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/X/data ARRAY
 6 file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/obsm GROUP
 7 file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/obsm/X_draw_graph_fr ARRAY
 8 file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/obsm/X_pca ARRAY
 9 file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/obsm/X_tsne ARRAY
10 file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/obsm/X_umap ARRAY
11 file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/obsp GROUP
12 file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/obsp/connectivities ARRAY
13 file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/obsp/distances ARRAY
14 file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/uns GROUP
15 file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/uns/draw_graph GROUP
16 file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/uns/draw_graph/params GROUP
17 file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/uns/draw_graph/params/random_state ARRAY
18 file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/uns/louvain GROUP
19 file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/uns/louvain/params GROUP
20 file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/uns/louvain/params/random_state ARRAY
21 file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/uns/louvain/params/resolution ARRAY
22 file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/uns/louvain_colors ARRAY
23 file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/uns/neighbors GROUP
24 file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/uns/neighbors/params GROUP
25 file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/uns/neighbors/params/n_neighbors ARRAY
26 file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/uns/pca GROUP
27 file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/uns/pca/variance ARRAY
28 file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/uns/pca/variance_ratio ARRAY
29 file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/uns/rank_genes_groups GROUP
30 file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/uns/rank_genes_groups/params GROUP
31 file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/uns/rank_genes_groups/params/use_raw ARRAY
32 file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/var ARRAY
33 file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/varm GROUP
34 file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/varm/PCs ARRAY
35 file:///tmp/tiledb/soco/pbmc3k_processed/ms/raw GROUP
36 file:///tmp/tiledb/soco/pbmc3k_processed/ms/raw/X GROUP
37 file:///tmp/tiledb/soco/pbmc3k_processed/ms/raw/X/data ARRAY
38 file:///tmp/tiledb/soco/pbmc3k_processed/ms/raw/var ARRAY
39 file:///tmp/tiledb/soco/pbmc3k_processed/obs ARRAY
40 file:///tmp/tiledb/soco/subset_100_100 GROUP
41 file:///tmp/tiledb/soco/subset_100_100/ms GROUP
42 file:///tmp/tiledb/soco/subset_100_100/ms/RNA GROUP
43 file:///tmp/tiledb/soco/subset_100_100/ms/RNA/X GROUP
44 file:///tmp/tiledb/soco/subset_100_100/ms/RNA/X/data ARRAY
45 file:///tmp/tiledb/soco/subset_100_100/ms/RNA/var ARRAY
46 file:///tmp/tiledb/soco/subset_100_100/obs ARRAY
>
```

cgiachalis commented 8 months ago

Excellent! Thanks for your prompt response.

eddelbuettel commented 8 months ago

After the fix:

```r
> tiledb::tiledb_object_walk("/tmp/tiledb/soco")
   TYPE  URI
 1 GROUP file:///tmp/tiledb/soco/pbmc3k_processed
 2 GROUP file:///tmp/tiledb/soco/pbmc3k_processed/ms
 3 GROUP file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA
 4 GROUP file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/X
 5 ARRAY file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/X/data
 6 GROUP file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/obsm
 7 ARRAY file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/obsm/X_draw_graph_fr
 8 ARRAY file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/obsm/X_pca
 9 ARRAY file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/obsm/X_tsne
10 ARRAY file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/obsm/X_umap
11 GROUP file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/obsp
12 ARRAY file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/obsp/connectivities
13 ARRAY file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/obsp/distances
14 GROUP file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/uns
15 GROUP file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/uns/draw_graph
16 GROUP file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/uns/draw_graph/params
17 ARRAY file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/uns/draw_graph/params/random_state
18 GROUP file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/uns/louvain
19 GROUP file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/uns/louvain/params
20 ARRAY file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/uns/louvain/params/random_state
21 ARRAY file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/uns/louvain/params/resolution
22 ARRAY file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/uns/louvain_colors
23 GROUP file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/uns/neighbors
24 GROUP file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/uns/neighbors/params
25 ARRAY file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/uns/neighbors/params/n_neighbors
26 GROUP file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/uns/pca
27 ARRAY file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/uns/pca/variance
28 ARRAY file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/uns/pca/variance_ratio
29 GROUP file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/uns/rank_genes_groups
30 GROUP file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/uns/rank_genes_groups/params
31 ARRAY file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/uns/rank_genes_groups/params/use_raw
32 ARRAY file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/var
33 GROUP file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/varm
34 ARRAY file:///tmp/tiledb/soco/pbmc3k_processed/ms/RNA/varm/PCs
35 GROUP file:///tmp/tiledb/soco/pbmc3k_processed/ms/raw
36 GROUP file:///tmp/tiledb/soco/pbmc3k_processed/ms/raw/X
37 ARRAY file:///tmp/tiledb/soco/pbmc3k_processed/ms/raw/X/data
38 ARRAY file:///tmp/tiledb/soco/pbmc3k_processed/ms/raw/var
39 ARRAY file:///tmp/tiledb/soco/pbmc3k_processed/obs
40 GROUP file:///tmp/tiledb/soco/subset_100_100
41 GROUP file:///tmp/tiledb/soco/subset_100_100/ms
42 GROUP file:///tmp/tiledb/soco/subset_100_100/ms/RNA
43 GROUP file:///tmp/tiledb/soco/subset_100_100/ms/RNA/X
44 ARRAY file:///tmp/tiledb/soco/subset_100_100/ms/RNA/X/data
45 ARRAY file:///tmp/tiledb/soco/subset_100_100/ms/RNA/var
46 ARRAY file:///tmp/tiledb/soco/subset_100_100/obs
>
```

I now need to ponder whether this warrants an 'interface change' (I lean towards no) and what it may do to tests. Apparently nothing (good!), but I seem to have left the last PR without a re-roxygenize run, which I may wrap into this.
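
Since the existing tests apparently did not notice the swap, a column-content check along the following lines might help guard against a regression. This is only a sketch in base R on mock data, not taken from the package's test suite: `check_walk_result()` is a hypothetical name, the allowed type names are limited to those shown in this thread, and the URI pattern assumes every URI carries a scheme such as `file://`.

```r
# Hypothetical check, not taken from the package's test suite.
check_walk_result <- function(df) {
  stopifnot(
    is.data.frame(df),
    identical(names(df), c("TYPE", "URI")),
    is.character(df$TYPE),
    is.character(df$URI)
  )
  # Type names seen in this thread (assumption: extend as needed)
  types_ok <- all(df$TYPE %in% c("ARRAY", "GROUP"))
  # Assumes each URI starts with a scheme followed by "://", as in the output above
  uris_ok <- all(grepl("^[A-Za-z][A-Za-z0-9+.-]*://", df$URI))
  types_ok && uris_ok
}

# Mock rows copied from the output above, in fixed and in swapped form
fixed <- data.frame(
  TYPE = c("GROUP", "ARRAY"),
  URI = c("file:///tmp/tiledb/soco/subset_100_100",
          "file:///tmp/tiledb/soco/subset_100_100/obs"),
  stringsAsFactors = FALSE
)
swapped <- data.frame(TYPE = fixed$URI, URI = fixed$TYPE,
                      stringsAsFactors = FALSE)

fixed_ok <- check_walk_result(fixed)
swapped_ok <- check_walk_result(swapped)
```

In a real test, the mock data frames would be replaced by the result of `tiledb_object_walk()` on a temporary directory holding a freshly written array.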

eddelbuettel commented 8 months ago

I will make this two PRs just to be cleaner.

cgiachalis commented 8 months ago

Nice. Is it a good time to ask if labels are coming to the R client? :)

cgiachalis commented 8 months ago

Oh, you might also want to update the LICENSE year: https://github.com/TileDB-Inc/TileDB-R/blob/d796865475ae79c8d63bb4355d39c65bdcec9604/LICENSE#L1

eddelbuettel commented 8 months ago

Wow. Another one.

eddelbuettel commented 8 months ago

> Nice. Is it a good time to ask if labels are coming to the R client? :)

Err, what labels? Am I behind some API offerings not covered?

cgiachalis commented 8 months ago

https://tiledb-inc-tiledb.readthedocs-hosted.com/projects/tiledb-py/en/stable/python-api.html#tiledb.ArraySchema

https://tiledb-inc-tiledb.readthedocs-hosted.com/projects/tiledb-py/en/stable/python-api.html#tiledb.Dim.create_label_schema

eddelbuettel commented 8 months ago

"Interesting"

I don't see it at https://docs.tiledb.com/main/how-to/arrays/creating-arrays/creating-the-array-schema but will poke.

cgiachalis commented 8 months ago

It might still be experimental?

eddelbuettel commented 8 months ago

Apparently, there is a header dimension_label_experimental.h. Will peruse / discuss. Thanks for the heads-up!