EricArcher / strataG

strataG is a toolkit for haploid sequence and multilocus genetic data summaries, and analyses of population structure.

having an issue with fsc2gtypes #63

Open stranda opened 1 year ago

stranda commented 1 year ago

Hi Eric, I was using strataG installed from GitHub and have been having an issue with fsc2gtypes(). The current version gives this error when I try to simulate some DNA sequences and read them back into R:

Error in result[, 1] : incorrect number of dimensions

This is using fsc27 as the simulator. Previous versions of strataG were working in this situation, but they were generating a warning from dplyr about specifying .cols when using across().
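For context, base R produces this message whenever two-index subsetting such as `x[, 1]` is applied to an object that has no `dim` attribute. The sketch below shows one common way that happens: a single-row selection silently drops a matrix to a plain vector. It is only an illustration in base R; the `result` matrix here is a hypothetical stand-in and is not taken from the strataG source, so it may not be the actual cause inside fsc2gtypes().

```r
# Hypothetical stand-in for an intermediate result; not taken from strataG.
result <- matrix(1:6, nrow = 3)

# Selecting a single row with the default drop = TRUE returns a plain vector,
# so the dim attribute is lost.
one_row <- result[1, ]
print(is.null(dim(one_row)))

# Two-index subsetting on that vector raises the error from the report.
msg <- tryCatch(one_row[, 1], error = function(e) conditionMessage(e))
print(msg)

# drop = FALSE keeps the 1 x 2 matrix shape, so column indexing still works.
kept <- result[1, , drop = FALSE]
first_col <- kept[, 1]
print(first_col)
```

If something similar is going on, a result that comes back with a single row or single column would be the case to check first.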

To reproduce, I've attached a binary format file (saved with saveRDS()) of the settings run by fscRun. It's stashed inside a zip archive so GitHub will allow me to upload it:

example.zip

Here are the actual lines I've been running:

runout = fscRun(simp, exec = fsc_exec, dna.to.snp = F)
res = fsc2gtypes(runout, marker = 'dna', as.genotypes = F, concat.dna = F)

where 'simp' is the object in the attached file and fsc_exec="fsc27093".

This isn't a huge problem for me, because I forked an older version that works but gives the 'across()' warning. I just wanted to give you a reproducible example.

EricArcher commented 1 year ago

Hiya Allan!

Would you send me the script used to create simp? It looks like it is the output from fscWrite() and that means I don't have access to the .par file it writes.

stranda commented 1 year ago

Sorry about that, I was trying to reduce complication yet be complete. And of course, I've got a complicated setup :)

The attached zip file produces a folder hierarchy with data that drive the simulations. If you cd to the battr/src subfolder there are a couple of files. The datafiles.R file has lots of initialization and environment vars. You will need to change two of them for your setup:

1) line 11, 'abspath': this needs to be set up for your system
2) line 28, 'fsc_exec': this needs to reflect the fsc27 executable in your PATH

This also requires that you run:

devtools::install_github("stranda/testInvPath")

to install the package upon which the simulation depends. The function in this package called 'createCoalObject()' actually uses fscWrite and the functions that build an object.

Once the package testInvPath is installed and the couple of changes in datafiles.R are made, you should be able to just run 'runReps.R' and it should run one simulation and calculate one set of summary stats (if it gets past fsc2gtypes()).

Interactively, you could run lines 1-31 of runReps.R, then lines 34-37 to build the coalescent simulation object, and then lines 41-43 to run the simulation. Line 43 addresses an issue I was having where I wanted to simulate one sequence per individual, but was having trouble unless I simulated two and dropped one afterwards. Based on some testing this morning, that still seems to be a problem.

Sorry for the Byzantine setup. I'm setting up a system to test alternative invasion pathways with genetic data from a large number of species and the complications arise from dealing with the idiosyncratic nature of empirical datasets.

Again, not a rush. I'm chugging ahead with my forked strataG.

Take care,
Allan

better_example.zip

EricArcher commented 1 year ago

Got it. I'll put it on my ToDo list. Let me know if you run into any more problems in the meantime.