bodkan / slendr

Population genetic simulations in R 🌍
https://bodkan.net/slendr
Other
54 stars 5 forks source link

trouble with ts_load(): Error in times * model$generation_time : non-numeric argument to binary operator #86

Closed ornobalam closed 2 years ago

ornobalam commented 2 years ago

Hi!

Congrats again on a really nice tool. I ran setup_env() and got the message that the python environment has been successfully created. msprime runs fine, but I get the following error when I try to run ts_load:

Error in times * model$generation_time : non-numeric argument to binary operator

Would appreciate any suggestions to get past this!

Best, Ornob

bodkan commented 2 years ago

Hello! Thanks for checking out slendr! And sorry you've ran into a problem this early. 😅 Beta version software is the worst. 🙂

(At least you've pulled through the setup_env() Python configuration, which is the biggest minefield for this R package and one that gives me the worst nightmares, being primarily an R developer.)

Could you please try to narrow down the problem to the smallest possible R script? Something I could run with the latest slendr version and see if I can reproduce the problem? The "smallest possible" is important because this will help me narrow down the issue much faster.

The error suggests that something is off with the internal code that converts time units from tskit "generations backwards in time" to slendr's "whatever time units are used by the user's model". Clearly a slendr bug I need to fix. There are a LOT of moving pieces involved in this conversion, so I'm honestly not that surprised things are not working as smoothly as they should (for now).

Thank you for your help and for your patience!

ornobalam commented 2 years ago

Thanks for the reply! Yeah, completely understandable that this package would have a lot of moving parts. I was playing along with some of the tutorial code. I've attached a much simplified sample that still produces the error.

slendr_trial.R.zip .

bodkan commented 2 years ago

Thanks for the quick reply!

I have good news and bad news.

Good news is, that I managed to run the script you've provided without error, including the ts_load call. I also tested both slim() and msprime() back ends and managed to load the output tree sequence from both.

Bad news is that right now that doesn't help much because you still got a different result.

That said, given that the script works on my configuration, maybe it's some bug that manifested in an earlier version of slendr but was fixed in a more recent version?

Just to establish some common baseline, could you try running devtools::install_github("bodkan/slendr") again, to make sure you have the exact same version of the package as I do, and try running that script again in a new R session with freshly loaded library(slendr).

(If you get a message about setting up a Python environment again after you load the new slendr version, don't worry. You might have to run setup_env() one more time. I updated some of the Python internal bits and had to change the name of the Python environment that slendr needs/creates. This won't be necessary after this update. At least it will be fast because slendr already set up the necessary Python components in the previous setup_env() run.)


If this still gives you error even with the newest slendr version, could you:

  1. run check_env() in an R console and paste what you see?
  2. type version in your R console and paste what you see?

We will get there! 💪

ornobalam commented 2 years ago

Thanks! So I did a fresh install and still had the issue. I don't know if it's relevant but I got the following warning message: Warning message: In i.p(...) : installation of package ‘vctrs’ had non-zero exit status**

(vctrs could not update from 0.4.0 to 0.4.1)

check_env() gave me this:

Summary of the currently active Python environment:

Python binary: /Users/ornobalam/Library/r-miniconda/envs/msprime-1.1.1_tskit-0.4.1_pyslim-0.700_pandas/bin/python Python version: 3.8.13 | packaged by conda-forge | (default, Mar 25 2022, 06:05:47) [Clang 12.0.1 ]

slendr requirements:

version gave me this:

platform x86_64-apple-darwin17.0
arch x86_64
os darwin17.0
system x86_64, darwin17.0
status
major 4
minor 1.3
year 2022
month 03
day 10
svn rev 81868
language R
version.string R version 4.1.3 (2022-03-10) nickname One Push-Up

bodkan commented 2 years ago

check_env() gave me this:

Summary of the currently active Python environment:

Python binary: /Users/ornobalam/Library/r-miniconda/envs/msprime-1.1.1_tskit-0.4.1_pyslim-0.700_pandas/bin/python Python version: 3.8.13 | packaged by conda-forge | (default, Mar 25 2022, 06:05:47) [Clang 12.0.1 ]

slendr requirements:

  • tskit: version 0.4.1 ✓
  • msprime: version 1.1.1 ✓
  • pyslim: version 0.700 ✓

Great, so the automatic internal Python env really does appear to be set up correctly. ✅

version gave me this:

platform x86_64-apple-darwin17.0 arch x86_64 os darwin17.0 system x86_64, darwin17.0 status major 4 minor 1.3 year 2022 month 03 day 10 svn rev 81868 language R version.string R version 4.1.3 (2022-03-10) nickname One Push-Up

Also great, same recent version of R that I have. ✅

Thanks! So I did a fresh install and still had the issue. I don't know if it's relevant but I got the following warning message: Warning message: In i.p(...) : installation of package ‘vctrs’ had non-zero exit status**

(vctrs could not update from 0.4.0 to 0.4.1)

Hmmm, this actually means that one of slendr's R dependencies failed to install... usually when this happens, the dependant R package (here slendr) is actually not installed properly.

Could you try running install.packages("vctrs") and post the whole log of what you get? I don't use this package directly in slendr, but it's used by other R packages that it depends on. I have a feeling this needs to be resolved before we can proceed with the slendr issue. Perhaps we see an obvious problem with vctrs that we could look up here.

bodkan commented 2 years ago

Also, just to make the process a bit more efficient in terms of time spent on you having to answer a series of potential follow-up questions:

Could you run .libPaths() in your R session and paste what you get here? Just to make sure there are no version mismatch issues with the package libraries.

Also (but this is kind of a long shot): some software packages need to be compiled and the computer needs to have proper compiler architecture. Now, I don't actually remember to what degree this is default on a computer which already has working R etc., but it might be worth it to try to run this xcode-select --install in your normal Terminal app. Just to make sure all of those compiler tools are present (this is a useful thing to have on your Mac regardless, so there's no reason not to do this).

ornobalam commented 2 years ago

Hi! Thanks so much for helping me troubleshoot! Here's the log from install.packages("vctrs"):

installing the source package ‘vctrs’

trying URL 'https://cran.rstudio.com/src/contrib/vctrs_0.4.1.tar.gz' Content type 'application/x-gzip' length 916782 bytes (895 KB)

downloaded 895 KB

.libPaths() gives:

"/Library/Frameworks/R.framework/Versions/4.1/Resources/library"

xcode-select --install gives: xcode-select: error: command line tools are already installed, use "Software Update" to install updates

Best, Ornob

bodkan commented 2 years ago

This is extremely weird. I tried Googling the errors you get from compiling vctrs and I didn't find a single relevant hit. I don't remember the last time this happened.

This look like some very strange C/C++ compilation errors which, to me, suggest that something might be very broken with the compiler toolchain setup on your system. I'm afraid I have no idea how to debug this. This actually suggests that many other R packages in the future should break during installation.

This is particularly bad because vctrs is needed for many other R packages in the tidyverse ecosystem. I.e., if you were to install newer versions of dplyr, tibble, or related R packages, you would probably run into the same issues.

If that's the case, given how important vctrs is, I would suggest opening an issue in their repository here and asking for help there. Hopefully they would recognize the errors you're getting and might even immediately know the cure. I'm actually nearly certain they would. I know this is frustrating, but this error you just pasted here suggest that something is very broken somewhere.

Basically, just pasting there what you send in the message just above this one should be a good starting point to finding out what is the problem.


Where does this leave slendr, I'm not sure. slendr uses lots of tidyverse functionality under the hood, so I wonder if the slendr error you started with is actually not related to slendr at all, but is perhaps a manifestation of this deeper problem. :(

Because you're getting installation failure from vctrs itself, I'm afraid we don't a clear path forward with your slendr issue before you can get to install it cleanly. I.e., before you can run devtools::install_github("bodkan/slendr") without problems with installing its own dependencies on their own (vctrs in particular).

Sorry that I can't immediately help you debug this. I strongly suggest reporting this installation bug with vctrs with its authors (again, the repo is here).


In the meantime, you could also try to get a git clone of the slendr codebase and load it manually. This is a matter of last resort and should never be necessary, but before you can get vctrs (and also other tidyverse dependencies that use vctrs) installed, it might be worth trying:

# run this in any directory that's suitable, let's assume in your home directory ~/
git clone github.com/bodkan/slendr
# this will create ~/slendr

Then in R, you should be able to run:

devtools::load_all("~/slendr")

Which basically substitutes for calling library(slendr) under normal circumstances. Then you could try running the script example again (but without the library(slendr) call!)


Still, as I said, this is an extreme solution which I'd advise against using under normal circumstances.

Fixing your R installation so that you can get vctrs and other tidyverse dependencies update is important.


Please get back to me once you get the vctrs installation/compilation issue sorted out!

bodkan commented 2 years ago

I will reveal my complete ignorance here, but I just tried installing vctrs again on my machine, forcing it to compile from the sources instead of using a pre-compiled package (worked without issues on my computer), and noticed that your R tries to compile the vctrs C code with clang++ which is a C++ compiler, but in my case it's using the clang program, which is a C compiler.

This would explain the warning you're getting clang-13: warning: treating 'c' input as 'c++' when in C++ mode, this behavior is deprecated [-Wdeprecated] and also this error:

./utils.h:143:53: error: declaration of anonymous class must be a definition
SEXP s3_class_find_method(const char* generic, SEXP class, SEXP table);
^

Which indicates that the C++ compiler is interpreting the class symbol as a variable in the C source, but when the C++ compiler is trying to compile this (incorrectly in your case, I'd guess), it's interpreting it as a class keyword, leading to a syntax error.

Unfortunately, how the R compiler infrastructure decides whether a C or C++ compiler should be used for which part of the codebase is way out of my league. I have never dealt with this myself.

Hopefully the tidyverse/vctrs authors will be able to help you. Because, as I said, vctrs is a crucial components of the tidyverse ecosystem and you'd be getting the same compilation errors even with other packages.

bodkan commented 2 years ago

I also just noticed that your R compilation uses Clang installed via Homebrew, whereas on my machine R compiles vctrs via Clang compiler provided by Apple via Xcode.

Is there a reason why your Clang is set to have a precedence? Did you modify your $PATH somehow to get this?

(Really shooting in the dark here, even more suggesting that reporting this problem to the vctrs authors is a better idea before we proceed with your original slendr problem. This really looks to me like a general R issue.)

ornobalam commented 2 years ago

Thank you so much for taking such an exhaustive look at this! Your instincts about using the Homebrew-installed compilers were correct. I had the Homebrew executable directories specified in the ~/.R/makevars file (I think I had done this while troubleshooting for another package). I erased the contents of the file, and was then able to install vctrs, which fixed my original ts_load() error. Apologies for the bother about what turned out to be a very system-specific issue, and thanks again for still taking the time to help me. Excited to now try out all the cool tskit functions on R :).

bodkan commented 2 years ago

Wow. That's incredible news! I'm so happy to hear this! I have to admit, I was getting a bit worried by the potential depth of this rabbit hole. :)

No need to apologise, this was not a bother at all! In fact, it was an extremely instructive troubleshooting session, even for me as an R developer. I'm not super familiar with using compiled code in R packages and the compiler infrastructure around this. I learned a lot by digging into this.

Thanks for the quick responses and informative error logs. Have fun with the R package and please, don't hesitate to open another issue in case you run into more problems.

Also, if you're interested more than just playing with slendr but intend to use it for research, it's probably a good idea to keep an eye on the changelog file where I started tracking small and larger changes more formally after publishing the preprint. Checking that changelog and upgrading slendr with devtools once in a while is a good idea. What's under "Development version" are all the updates that will be installed when running devtools::install_github("bodkan/slendr"). It's been a long time since a real dramatic API change was introduced, so the process of upgrading is fairly safe (every dramatic change will be described in the changelog to prevent nasty surprises).

(As an aside, I have a git branch with some generalized tskit R interface, which will make it possible to run tskit operations from R on standard, non-slendr tree sequences from msprime and SLiM. Stay tuned!)

ornobalam commented 2 years ago

Thanks for the tip on the changelog file! I do intend to use it for research. I don't know if you are thinking along these lines (in terms of how you see the package being used), but I was trying to create an R-contained pipeline to run simulations and perform ABC inference when I first came across slendr. I had been using the coala package, which is great and has some advantages for ABC like being able to specify the number of iterations when simulating a model and using multiple cores. But I really like slendr's access to tskit summary statistics, the ability to sample at different times (I have some historical samples), and somewhat lazily, the fact that I don't have to scale parameters by effective population size. Will see if I can get it to work!

bodkan commented 2 years ago

I do intend to use it for research. I don't know if you are thinking along these lines (in terms of how you see the package being used), but I was trying to create an R-contained pipeline to run simulations and perform ABC inference when I first came across slendr. I had been using the coala package, which is great and has some advantages for ABC like being able to specify the number of iterations when simulating a model and using multiple cores.

That's exciting! Although slendr started as a purely SLiM-based spatial simulation framework , lots of the development that happened over the past 6 months has been targeted at turning it into a simulation engine that could be used exactly for this kind of ABC demographic inference. This is exactly why we now have the support for traditional, non-spatial coalescent simulations using the built-in msprime backend and why I have been working on generalizing the tskit interface to tree sequence processing and summary statistics towards tree sequences from arbitrary sources.

Extending the support for real-world ABC inference is definitely on my agenda, because it's very much one of the use-cases I have for slendr myself. I have some todo items related this use-case that a new student that has just started in our group will be looking into. I don't think these updates will make it to the version that's currently described in the preprint and which will ultimately be published, but it will happen very soon.

I would love to hear your thoughts on this. Perhaps we can stay in touch? Feel free to open a new issue with suggestions for the functionality you'd like to have implemented, or changes you'd like to see to the current interface, or just send me an email with your suggestions!

ornobalam commented 2 years ago

That's awesome to hear! Yes, being able to import tree sequences inferred from real data would be really useful. I would love to stay in touch about this. I am currently trying out inference on simulated data from simple models and will let you know if I think of ways slendr could better accommodate ABC. Thanks!