OHDSI / Andromeda

AsynchroNous Disk-based Representation of MassivE DAta: An R package aimed at replacing ff for storing large data objects.
https://ohdsi.github.io/Andromeda/

arrow_S4: Behaviour of for loop inside loadAndromeda() #41

Closed solis9753 closed 2 years ago

solis9753 commented 2 years ago

The for loop inside loadAndromeda() on the arrow_S4 branch fails with an IOError that I cannot explain.

Note that the equivalent purrr::map() and lapply() calls work. Running under: macOS Monterey 12.6

suppressPackageStartupMessages(library(Andromeda))
library(purrr)
andromeda <- Andromeda:::.newAndromeda()
andromeda$mtcars <- mtcars
andromeda$iris <- iris

saveAndromeda(andromeda, "testAndromeda")
loadAndromeda("testAndromeda")
#> Error: IOError: Failed to open local file '/private/var/folders/n5/v3hbbdqs0f554tk29kqb67040000gn/T/Rtmp3r6hAR/file9a7c49a9fdc8/iris/part-0.arrow'. Detail: [errno 2] No such file or directory

andromeda <- Andromeda:::.newAndromeda()
path <- andromeda@path
zip::unzip("testAndromeda", exdir = path)
tableNames <- list.dirs(path, full.names = FALSE, recursive = FALSE)
## tables to read 
tableNames
#> [1] "iris"   "mtcars"

## Trying purrr::map()
purrr::map(.x = tableNames, .f = ~arrow::open_dataset(file.path(path, .x), format = "feather"))
#> [[1]]
#> FileSystemDataset with 1 Feather file
#> Sepal.Length: double
#> Sepal.Width: double
#> Petal.Length: double
#> Petal.Width: double
#> Species: dictionary<values=string, indices=int8>
#> 
#> See $metadata for additional Schema metadata
#> 
#> [[2]]
#> FileSystemDataset with 1 Feather file
#> mpg: double
#> cyl: double
#> disp: double
#> hp: double
#> drat: double
#> wt: double
#> qsec: double
#> vs: double
#> am: double
#> gear: double
#> carb: double
#> 
#> See $metadata for additional Schema metadata

## Trying lapply
lapply(tableNames, function(x) arrow::open_dataset(file.path(path, x), format = "feather"))
#> [[1]]
#> FileSystemDataset with 1 Feather file
#> Sepal.Length: double
#> Sepal.Width: double
#> Petal.Length: double
#> Petal.Width: double
#> Species: dictionary<values=string, indices=int8>
#> 
#> See $metadata for additional Schema metadata
#> 
#> [[2]]
#> FileSystemDataset with 1 Feather file
#> mpg: double
#> cyl: double
#> disp: double
#> hp: double
#> drat: double
#> wt: double
#> qsec: double
#> vs: double
#> am: double
#> gear: double
#> carb: double
#> 
#> See $metadata for additional Schema metadata

## Trying the for loop
for (nm in tableNames) {
  andromeda[[nm]] <- arrow::open_dataset(file.path(path, nm), format = "feather")
}
#> Error: IOError: Failed to open local file '/private/var/folders/n5/v3hbbdqs0f554tk29kqb67040000gn/T/Rtmp3r6hAR/file9a7c37547b40/iris/part-0.arrow'. Detail: [errno 2] No such file or directory

Created on 2022-11-20 with reprex v2.0.2
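
One way to narrow this down might be to separate the loop from the assignment: the purrr::map() and lapply() calls above only open the datasets, whereas the for loop also assigns the result into andromeda[[nm]]. The following is a minimal sketch, assuming the arrow package is installed; the scratch directory and the plain `datasets` list are hypothetical stand-ins for the unzipped folder and the Andromeda object, not part of the package.

```r
library(arrow)

# Hypothetical scratch directory standing in for the unzipped Andromeda folder
path <- tempfile("andromedaSketch")
dir.create(path)

# Write two Feather datasets, one sub-directory per table
arrow::write_dataset(mtcars, file.path(path, "mtcars"), format = "feather")
arrow::write_dataset(iris, file.path(path, "iris"), format = "feather")

tableNames <- list.dirs(path, full.names = FALSE, recursive = FALSE)

# Same for loop as in the reprex, but assigning into a plain list
# instead of andromeda[[nm]]
datasets <- list()
for (nm in tableNames) {
  datasets[[nm]] <- arrow::open_dataset(file.path(path, nm), format = "feather")
}

names(datasets)
```

If this loop runs cleanly, that would suggest looking at the assignment into the Andromeda object rather than at the for loop itself. That is only a guess at where to look, not a confirmed cause.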

ablack3 commented 2 years ago

Thank you for the reprex! I'm also on macOS 12.6. I get the same error when I run this through reprex but, strangely enough, no error when I run the code interactively.

ablack3 commented 2 years ago

I think this is fixed with my most recent commit. Will you reinstall and test again?

suppressPackageStartupMessages(library(Andromeda))
andromeda <- andromeda()
andromeda$mtcars <- mtcars
andromeda$iris <- iris

saveAndromeda(andromeda, "testAndromeda")
loadAndromeda("testAndromeda")
#> # Andromeda object
#> # Physical location:  /var/folders/xx/01v98b6546ldnm1rg1_bvk000000gn/T//RtmpiELMPG/file2dfe98e918e
#> 
#> Tables:
#> $iris (Sepal.Length, Sepal.Width, Petal.Length, Petal.Width, Species)
#> $mtcars (mpg, cyl, disp, hp, drat, wt, qsec, vs, am, gear, carb)

Created on 2022-11-20 with reprex v2.0.2

solis9753 commented 2 years ago

Fixed indeed with the latest commit.