ropensci / babette

babette is an R package to work with BEAST2
https://docs.ropensci.org/babette
GNU General Public License v3.0

babette sleeps/freezes/becomes unresponsive #104

Closed AJFeng closed 2 months ago

AJFeng commented 1 year ago

I have hundreds of BEAST trees to run, so I wrote a for loop around the run_beast2 function. However, I found that the program often sleeps when I am not using the computer. I have never had this problem before, even though I regularly run programs for many days (whether R, Python, or jar files; if I run the BEAST jar directly, it does not sleep). I tried two computers, and both showed the problem. The R program does not stop, it just sleeps; once I use the computer again, it continues to run. Does anyone know what the problem is?

richelbilderbeek commented 1 year ago

Hi @AJFeng, thanks for posting this Issue.

Could you supply a minimally reproducible example?

I predict it is not babette that causes the sleep, but BEAST2 or R prompting for user input. The latter happens on, among others, computer clusters.

Looking forward to the code!

Thanks and cheers, Richel

AJFeng commented 1 year ago

The following is the code I am using:

library(beastier)

setwd(dirname(rstudioapi::getActiveDocumentContext()$path))

file_list <- list.files('./xml_file/')
file_list <- rev(file_list)
for (file in file_list) {
  name <- unlist(strsplit(file, '.xml', fixed = TRUE))

  if (file.exists(paste0('./BEAST_output/trees/', name, '.trees'))) {next}

  output_state_filename <- paste0('./BEAST_output/states/', name, ".state")

  run_beast2(
    input_filename = paste0('./xml_file/', file),
    output_state_filename = output_state_filename,
    n_threads = 16,
    use_beagle = TRUE,
    verbose = TRUE
  )
}

richelbilderbeek commented 1 year ago

Hi @AJFeng, thanks for the reply (and sorry I am late; I am usually quicker),

  1. For me to reproduce the problem, could you also share one of the XML files that causes the problem?
  2. Could you run beastier::is_beast2_input_file(input_filename)?

I predict that the XML file uses a different version than beastier supports, and that, to confirm this, beastier::is_beast2_input_file(input_filename) returns FALSE. This may cause BEAST2 to give a prompt, hence freezing the process.
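The check suggested above can be applied to a whole folder before starting the loop. This is only a minimal sketch: the helper name find_invalid_inputs is hypothetical, and the is_valid argument is an assumption added so the helper can be tried with a stand-in checker when BEAST2 is not installed. By default it calls beastier::is_beast2_input_file, as named in this thread.

```r
# Hypothetical helper (not part of beastier): return the files that
# fail validation, so they can be inspected before a long loop starts.
# `is_valid` defaults to beastier::is_beast2_input_file; it is a
# parameter only so a stand-in checker can be used for a dry run.
find_invalid_inputs <- function(filenames,
                                is_valid = beastier::is_beast2_input_file) {
  ok <- vapply(
    filenames,
    function(f) isTRUE(is_valid(f)),
    logical(1),
    USE.NAMES = FALSE
  )
  filenames[!ok]
}

# Possible usage, assuming the folder layout from the loop in this thread:
# xml_files <- list.files("./xml_file", pattern = "\\.xml$", full.names = TRUE)
# find_invalid_inputs(xml_files)
```

An empty result would suggest that the XML version is not the cause.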

Measuring is knowing and thanks for checking babette!

AJFeng commented 1 year ago

Thank you so much for the quick response.

I cannot provide the XML file as it includes the patients' sensitive info inside.

But I think the problem may be due to Windows 10's cache processing. I tried using a Python loop to call the BEAST jar with Java, and even though it does not sleep, it loses records from the log and tree files. For example, I set the MCMC chain length to 30M, but if I did not use the computer, some of the saved logs and trees would randomly have only 16M, 20M or 26M records. The program keeps running.

I also ran the same Python code on a Linux system and did not see those problems. (The R code did not work there, and I don't know why: it gave no errors, but no log or tree records were saved.)
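Truncated runs like the ones described above could be detected automatically after each run. The following is a rough sketch in base R, under two assumptions that are not confirmed in this thread: that the .log file is tab-separated with comment lines starting with '#' and the sampled state number in the first column, and that a complete run ends with a logged state equal to the chain length. The function names are hypothetical.

```r
# Hypothetical helpers: read a BEAST2-style trace log and report the
# last state that was written to it.
# Assumption: tab-separated, '#' comment lines, state number in column 1.
last_logged_state <- function(log_filename) {
  log <- tryCatch(
    utils::read.delim(log_filename, comment.char = "#"),
    error = function(e) NULL  # e.g. an empty or missing file
  )
  if (is.null(log) || nrow(log) == 0) {
    return(NA_real_)
  }
  max(log[[1]])
}

# TRUE only if the log reached the requested chain length.
# Assumption: the final logged state equals the chain length.
is_complete_run <- function(log_filename, chain_length) {
  last <- last_logged_state(log_filename)
  !is.na(last) && last >= chain_length
}

# Possible usage, with a hypothetical path:
# is_complete_run("./BEAST_output/logs/sample1.log", chain_length = 30e6)
```

Running such a check inside the loop would show which inputs produced short logs, and whether that depends on the computer being idle.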

AJFeng commented 1 year ago

beastier::is_beast2_input_file(paste0('./xml_file/',file_list[1]))

returns TRUE

richelbilderbeek commented 1 year ago

@AJFeng interesting.

Great to know for sure that the XML file is correct. That removes it from the list.

Still, I'd love to see that XML file with the real genetic data replaced by, for example, 3 dummy sequences. This would help me distinguish between 'freezing' and 'just taking a long time'.

I do know babette checks the input quite carefully and that may take some time.

And I know that using BEAGLE and 16 threads is completely untested. Does babette freeze without using BEAGLE? If that takes too long, remove all sequences but, say, 3. With three sequences, does babette freeze with BEAGLE?
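The with/without BEAGLE comparison could be sketched as below. The helper names time_beast2_run and compare_beagle are hypothetical, and the run argument is an assumption added so the sketch can be exercised with a stand-in function. By default it calls beastier::run_beast2 with only the input_filename and use_beagle arguments that appear earlier in this thread.

```r
# Hypothetical helper: time one BEAST2 run in seconds.
# `run` defaults to beastier::run_beast2; it is a parameter only so a
# stand-in can be passed when BEAST2 is not available.
time_beast2_run <- function(input_filename, use_beagle,
                            run = beastier::run_beast2) {
  timing <- system.time(
    run(input_filename = input_filename, use_beagle = use_beagle)
  )
  timing[["elapsed"]]
}

# Hypothetical helper: run the same input first without, then with BEAGLE.
compare_beagle <- function(input_filename, run = beastier::run_beast2) {
  c(
    without_beagle = time_beast2_run(input_filename, FALSE, run = run),
    with_beagle = time_beast2_run(input_filename, TRUE, run = run)
  )
}

# Possible usage, with a hypothetical three-sequence input file:
# compare_beagle("./xml_file/three_sequences.xml")
```

On a three-sequence input both runs should finish quickly, so a call that never returns points to a freeze and not to a long computation. The sketch has no timeout of its own, so a frozen run still has to be stopped by hand.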

Thanks for helping me make babette better. If this results in a new unit test, I can guarantee babette users more :-)

richelbilderbeek commented 2 months ago

This Issue has gone stale. Closing it!