richelbilderbeek / pirouette

R package that estimates the error BEAST2 makes from a given phylogeny
GNU General Public License v3.0

Improve speed of pir_plot_from_files #419

Closed richelbilderbeek closed 4 years ago

richelbilderbeek commented 4 years ago

Is your feature request related to a problem? Please describe.

pir_plot_from_files is slow. Too slow.

Describe the solution you'd like

Make it faster.

Describe alternatives you've considered

None.

Additional context

richel@N141CU:~/GitHubs/pirouette/scripts$ Rscript pir_plot_from_files.R > ~/pir_timings.txt
Loading required package: beautier
Loading required package: babette
Loading required package: beastier
Loading required package: mauricer
Loading required package: tracerer
1: 2020-07-06 12:45:24
2: 2020-07-06 12:46:12

richelbilderbeek commented 4 years ago

[Two screenshots: 2020-07-06 15-55-35 and 2020-07-06 15-58-28]

richelbilderbeek commented 4 years ago

100 folders in less than 10 seconds:

library(pirouette)
library(testthat)

# Alternative example folders; uncomment one to use it instead
#super_folder <- "/home/richel/pirouette_example_42/pirouette_example_42/example_42"
#super_folder <- "/media/richel/D2B40C93B40C7BEB/pirouette_examples/pirouette_example_18/example_18"
super_folder <- "/home/richel/pirouette_example_32/pirouette_example_32/example_32"

# Collect the folders below super_folder, excluding super_folder itself
folder_names <- list.dirs(
  super_folder
)
folder_names <- folder_names[folder_names != super_folder]
folder_names
expect_true(all(dir.exists(folder_names)))

# Read all folders into one long table and plot it
long_pir_out <- create_long_pir_out_from_folders(folder_names = folder_names)
p <- pir_plot_from_long(long_pir_out)
# Pass the plot via 'plot =': adding ggsave() with '+' is not supported
# and gives an error in newer ggplot2 versions
ggplot2::ggsave("~/example_42.png", plot = p + ggplot2::ggtitle(""), width = 7, height = 7)
p

richelbilderbeek commented 4 years ago

Glory to :clap::clap::clap: dplyr::bind_rows :clap::clap::clap:
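
The thread does not show the change itself, so the following is only a minimal sketch of why combining everything in a single `dplyr::bind_rows` call tends to be faster than growing a data frame one `rbind` at a time. `read_one_folder` is a hypothetical stand-in for reading one folder's results; it is not a pirouette function. The sketch falls back to base R when dplyr is not installed.

```r
# Hypothetical stand-in for reading the results of one folder;
# this is NOT a pirouette function
read_one_folder <- function(i) {
  data.frame(folder = i, error = seq(0, 1, length.out = 5))
}

n_folders <- 100

# Slow pattern: grow the data frame with one rbind per folder,
# which copies all rows collected so far in every iteration
grow_with_rbind <- function(n) {
  out <- data.frame()
  for (i in seq_len(n)) {
    out <- rbind(out, read_one_folder(i))
  }
  out
}

# Faster pattern: collect the pieces in a list, then combine them once
combine_once <- function(n) {
  pieces <- lapply(seq_len(n), read_one_folder)
  if (requireNamespace("dplyr", quietly = TRUE)) {
    as.data.frame(dplyr::bind_rows(pieces))
  } else {
    do.call(rbind, pieces)  # base R fallback when dplyr is not installed
  }
}

slow <- grow_with_rbind(n_folders)
fast <- combine_once(n_folders)
```

Both functions produce the same rows; wrapping each call in `system.time()` with a larger `n_folders` shows the difference in run time.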

richelbilderbeek commented 4 years ago

Timing it:

library(pirouette)
library(testthat)

# Alternative example folders; uncomment one to use it instead
#super_folder <- "/home/richel/pirouette_example_42/pirouette_example_42/example_42"
#super_folder <- "/media/richel/D2B40C93B40C7BEB/pirouette_examples/pirouette_example_18/example_18"
super_folder <- "/home/richel/pirouette_example_32/pirouette_example_32/example_32"

# Collect the folders below super_folder, excluding super_folder itself
folder_names <- list.dirs(
  super_folder
)
folder_names <- folder_names[folder_names != super_folder]
folder_names
expect_true(all(dir.exists(folder_names)))

# Print the time before and after to measure the run time
Sys.time()
long_pir_out <- create_tree_and_model_errors_from_folders(
  folder_names = folder_names
)
p <- pir_plot_from_long(long_pir_out)
# Pass the plot via 'plot =': adding ggsave() with '+' is not supported
# and gives an error in newer ggplot2 versions
ggplot2::ggsave("~/example_42.png", plot = p + ggplot2::ggtitle(""), width = 7, height = 7)
p
Sys.time()

results in:

> Sys.time()
[1] "2020-07-14 12:50:19 CEST"
> long_pir_out <- create_tree_and_model_errors_from_folders(
+   folder_names = folder_names
+ )
> p <- pir_plot_from_long(long_pir_out)
> p + ggplot2::ggtitle("") + ggplot2::ggsave("~/example_42.png", width = 7, height = 7)
> p
> Sys.time()
[1] "2020-07-14 12:50:41 CEST"

That is 22 seconds!
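
As a small sketch, the elapsed time can also be computed instead of read off by eye. The timestamps below are copied from the console output above; `Sys.sleep(0.1)` is only a placeholder for the code being timed.

```r
# Timestamps copied from the console output above.
# Both are parsed in the same time zone, so the choice of zone
# does not affect the difference.
start_time <- as.POSIXct("2020-07-14 12:50:19", tz = "UTC")
end_time <- as.POSIXct("2020-07-14 12:50:41", tz = "UTC")
elapsed_secs <- as.numeric(difftime(end_time, start_time, units = "secs"))
elapsed_secs  # 22

# Alternative: let R measure the elapsed time directly.
# Sys.sleep(0.1) is a placeholder for the code to be timed.
timing <- system.time(Sys.sleep(0.1))
timing[["elapsed"]]
```

`system.time()` avoids comparing two printed timestamps by hand and also reports user and system CPU time.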