Please provide a completely reproducible example, including your exact command-line call to BEAST, your XML, and the output from `beast -beagle_info`. Thank you!
XML: rsvb_geo_samp_aligned_100M.txt
output of `beast -beagle_info`:
command-line call to BEAST: beast -beagle_GPU -beagle_double -beagle_order 1,1,2 -overwrite ${XML}
log of analysis (terminated early): terminated_beast_log.txt
Use `beast -beagle_order 0,1 -overwrite ${XML}` to put the first (multi-partition) data-likelihood onto the CPU and the second (host) data-likelihood onto the GPU. For user issues and help, please post to the https://groups.google.com/g/beast-users list-serv, as this helps better engage the whole community.
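To spell that out as a sketch (assuming resource 0 is the CPU and resources 1 and 2 are the two GPUs, as numbered in the `-beagle_info` listing; resource numbers may differ on other machines):

```bash
# Sketch: how the -beagle_order entries are intended to map onto resources
# (resource numbering taken from `beast -beagle_info`; here 0 = CPU, 1,2 = GPUs).
XML=rsvb_geo_samp_aligned_100M.txt   # the XML attached above

# List the BEAGLE resources and their numbers.
beast -beagle_info

# First (multi-partition) data-likelihood -> resource 0 (CPU),
# second (host) data-likelihood -> resource 1 (first GPU).
beast -beagle_order 0,1 -overwrite ${XML}
```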
`beast -beagle_order 0,1 -overwrite ${XML}` works (nt data on CPU, host data on GPU 1), but `beast -beagle_order 1,2 -overwrite ${XML}` does not (puts nt data and host data on GPU 1, ignores GPU 2). Same XML file as above. Output of `-beagle_info` checked again, same as above.
Hi @msuchard, thanks for the help so far. I have some more data points here that hopefully shed a bit more light on the issue. These were made with the same input XML and various ways of attempting (and failing) to place the geographic partition on its own dedicated GPU, also trying two different BEAST/BEAGLE versions.
The partitions in that XML are as follows:
Read alignment: alignment
Sequences = 1347
Sites = 963
Datatype = nucleotide
Site patterns 'CP1+2.patterns' created by merging 2 pattern lists
pattern count = 539
Site patterns 'CP3.patterns' created from positions 3-963 of alignment 'alignment'
only using every 3 site
unique pattern count = 314
Read attribute patterns, 'region.pattern' for attribute, region
Creating the tree model, 'treeModel'
taxon count = 1347
tree height = 16.656191930600656
Just to confirm: I am interpreting the `Using BEAGLE TreeLikelihood` section as the one that shows the hardware assignment for where the geographic inference is running (let me know if I'm wrong about that; I don't personally have experience with phylogeography). Assuming that's right, here's the matrix of things I've tried:
| # V100 GPUs | requested behavior | BEAST & BEAGLE version | actual hardware assignments |
|---|---|---|---|
| 1 | `-beagle_order 0,1` | 1.10.5pre_thorney_0.1.2 on 4.0.0 | 0,1,0 (Data, Data, Tree) |
| 1 | `-beagle_order 0,1 -beagle_multipartition on` | 1.10.5pre_thorney_0.1.2 on 4.0.0 | 0,1,0 |
| 1 | `-beagle_order 0,1,1` | 1.10.5pre_thorney_0.1.2 on 4.0.0 | 0,1,0 |
| 4 | `-beagle_order 1,2,3` | 1.10.5pre_thorney_0.1.2 on 4.0.0 | 1,1 (Multipart Data, Tree) |
| 1 | `-beagle_order 0,1` | 1.10.4 on 3.1.2 | 0,1,0 |
| 1 | `-beagle_order 0,0,1` | 1.10.4 on 3.1.2 | 0,0,0 |
| 1 | `-beagle_order 0,1,1` | 1.10.4 on 3.1.2 | 0,1,0 |
I think the desired outcome here is for the actual hardware assignments to be either Multipartition Data = 0, Tree = 1; CP1+2 = 0, CP3 = 0, Tree = 1; or MPData = 1, Tree = 2; etc. But I can't get it to honor the `beagle_order` request, and whether or not it decides to multipartition the nucleotide data seems both unpredictable and uncorrelated with whether I specify `beagle_multipartition`.
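To make those desired outcomes concrete, these are the invocations I would expect to produce them, under my (possibly wrong) assumption that the entries of `-beagle_order` map to likelihood instances in the order BEAST creates them, and that `-beagle_multipartition on` merges the two nucleotide partitions into a single instance:

```bash
# Sketch of invocations I would *expect* to give the desired assignments
# (assumption: -beagle_order entries map to likelihoods in creation order).
XML=rsvb_geo_samp_aligned_100M.txt   # the XML attached above

# CP1+2 = 0, CP3 = 0, Tree (geography) = 1
beast -beagle_order 0,0,1 -overwrite ${XML}

# Multipartition Data = 0, Tree (geography) = 1
beast -beagle_multipartition on -beagle_order 0,1 -overwrite ${XML}

# On the 4-GPU node: MPData = 1, Tree (geography) = 2
beast -beagle_multipartition on -beagle_order 1,2 -overwrite ${XML}
```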
A tarball of all the stdout/stderr files from these runs can be temporarily found at gs://viral-public-temp-30d/beast/beast_logs_treelikelihood_beagleorder.tar.gz. An abbreviated example of the stdout corresponding to the first row in the table above (`-beagle_order 0,1`, latest BEAST/BEAGLE) is here; please let me know if I'm not interpreting its hardware assignments correctly:
Using BEAGLE DataLikelihood Delegate
Using BEAGLE resource 0: CPU (x86_64)
with instance flags: PRECISION_DOUBLE COMPUTATION_SYNCH EIGEN_REAL SCALING_MANUAL SCALERS_RAW VECTOR_SSE THREADING_CPP PROCESSOR_CPU FRAMEWORK_CPU PREORDER_TRANSPOSE_MANUAL
Ignoring preOrder partials in tree likelihood.
Ignoring ambiguities in tree likelihood.
With 539 unique site patterns.
Using rescaling scheme : dynamic (rescaling every 100 evaluations, delay rescaling until first overflow)
Using TreeDataLikelihood
Branch rate model used: strictClockBranchRates
Using BEAGLE DataLikelihood Delegate
Using BEAGLE resource 1: Tesla V100-SXM2-16GB
Global memory (MB): 16161
Clock speed (Ghz): 1.53
Number of cores: 10240
with instance flags: PRECISION_DOUBLE COMPUTATION_SYNCH EIGEN_REAL SCALING_MANUAL SCALERS_RAW VECTOR_NONE THREADING_CPP THREADING_NONE PROCESSOR_GPU FRAMEWORK_CUDA PREORDER_TRANSPOSE_MANUAL
Ignoring preOrder partials in tree likelihood.
Ignoring ambiguities in tree likelihood.
With 314 unique site patterns.
Using rescaling scheme : dynamic (rescaling every 100 evaluations, delay rescaling until first overflow)
Using TreeDataLikelihood
Branch rate model used: strictClockBranchRates
Creating state frequencies model 'region.frequencies': Initial frequencies = {0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1}
General Substitution Model (stateCount=10)
Using BSSVS Complex Substitution Model
Creating site rate model.
Creating state frequencies model 'region.root.frequencies': Initial frequencies = {0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1}
Using BEAGLE TreeLikelihood
Branch rate model used: strictClockBranchRates
Using BEAGLE resource 0: CPU (x86_64)
with instance flags: PRECISION_DOUBLE COMPUTATION_SYNCH EIGEN_COMPLEX SCALING_MANUAL SCALERS_RAW VECTOR_NONE THREADING_CPP PROCESSOR_CPU FRAMEWORK_CPU PREORDER_TRANSPOSE_MANUAL
Ignoring ambiguities in tree likelihood.
With 1 unique site patterns.
Using rescaling scheme : delayed (delay rescaling until first overflow)
Optimization Schedule: log
Creating CTMC Scale Reference Prior model.
Acting on subtree of size 1347
Constructing a cache around likelihood 'null', signal = region.rates
Hello,
I am running BEAST with a CPU + 2 V100 GPUs. My model has 2 nt partitions (CP1+2, CP3) and 1 geographic partition.
Under any `beagle_order` (1,1,2; 0,0,1; 2,2,1; etc.), BEAGLE places the geographic partition on the same device as the nt partitions. For example, with 1,1,2 as the specified order, BEAGLE places all partitions on the 1st GPU.
Attaching the .log file for `beagle_order 0,0,1`: beast-log-file.txt. I have already confirmed with `beagle_info` that the CPU and both GPUs are recognized by the software.
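For concreteness, here is a sketch of the kinds of calls I have been making (same flag spellings as my command above; treat the particular orders as examples of what I've tried, not a verified recipe):

```bash
# Examples of the beagle_order values I have tried (sketch, not a verified recipe).
XML=rsvb_geo_samp_aligned_100M.txt   # the XML attached above

# Confirm the CPU and both GPUs are visible to BEAGLE.
beast -beagle_info

# Intended: nt partitions on GPU 1, geographic partition on GPU 2.
beast -beagle_GPU -beagle_double -beagle_order 1,1,2 -overwrite ${XML}

# Intended: nt partitions on the CPU, geographic partition on GPU 1.
beast -beagle_GPU -beagle_double -beagle_order 0,0,1 -overwrite ${XML}
```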