Open ignacio82 opened 5 years ago
thanks for the report, I'll take a look.
hmm... we can solve the errors such as NotFoundError: ./libdevice.compute_30.10.bc not found by copying /usr/local/cuda-9.0 from the rocker/cuda-dev image, but then I seem to be running up against https://github.com/tensorflow/tensorflow/blob/master/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc#L485-L489 instead.
Not exactly clear to me how to cherrypick ptxas 9.2.88 though.
Bumping all of cuda to 9.2.88 seems to break tensorflow, as it looks like the binaries installed by pip (for 0.12.0) are built only for cuda 9.0.
A second error I encounter, e.g. via either the virtualenv install route or in building on tensorflow/tensorflow:1.13.1-gpu-py3, is ValueError: Tensor conversion requested dtype int64 for Tensor with dtype int32. Longer trace below.
Error in py_call_impl(callable, dots$args, dots$keywords) :
ValueError: Tensor conversion requested dtype int64 for Tensor with dtype int32: 'Tensor("Placeholder_13:0", dtype=int32)'
Detailed traceback:
File "/usr/local/lib/python3.5/dist-packages/tensorflow_probability/python/mcmc/sample.py", line 216, in sample_chain
name="num_steps_between_results")
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1039, in convert_to_tensor
return convert_to_tensor_v2(value, dtype, preferred_dtype, name)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1097, in convert_to_tensor_v2
as_ref=False)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1175, in internal_convert_to_tensor
ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 977, in _TensorTensorConversionFunction
(dtype.name, t.dtype.name, str(t)))
still digging...
I think that last error just means you have the CRAN release of greta, but need the current GitHub version.
Something changed in the most recent Tensorflow Probability release, and the greta-side patch hasn't yet made its way to CRAN.
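One way to tell which build of greta is installed is to look at the package's DESCRIPTION metadata. This is only a rough sketch: it assumes that packages installed through remotes record a RemoteType field and that CRAN builds carry a Repository field, and install_source is a hypothetical helper name, not something greta provides.

```r
# Hypothetical helper: report where an installed R package came from.
# Assumption: remotes-style installs set RemoteType (e.g. "github"),
# while CRAN builds set Repository ("CRAN").
install_source <- function(pkg) {
  # system.file() returns "" when the package is not installed,
  # and does not load the package's namespace.
  if (!nzchar(system.file(package = pkg))) {
    return("not installed")
  }
  desc <- utils::packageDescription(pkg)
  if (!is.null(desc$RemoteType)) {
    return(desc$RemoteType)
  }
  if (!is.null(desc$Repository)) {
    return(desc$Repository)
  }
  "unknown"
}

print(install_source("greta"))
```

If this prints CRAN, the installed copy is probably the release that predates the patch mentioned above.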
@goldingn thanks Nick, that's the ticket!
@ignacio82 Once rocker/tensorflow-gpu builds (probably by tomorrow, or just docker build locally), you should be able to do a remotes::install_github("greta-dev/greta") and then gpu-accelerated greta should be working.
Thanks again for the bug report, hadn't gotten around to testing greta, it's still somewhat early days for these ML images.
Thanks! A couple of questions:
rocker/tensorflow-gpu but I think I should use rocker/ml-gpu:latest. With the former I got a message saying that I needed to install tensorflow probability. Is that right or should I use rocker/tensorflow-gpu?
/usr/local/lib/python3.5/dist-packages/numpy/lib/type_check.py:546: DeprecationWarning: np.asscalar(a) is deprecated since NumPy v1.16, use a.item() instead
'a.item() instead', DeprecationWarning, stacklevel=1)
Is this a problem that the greta developers need to fix?
@ignacio82 Right, I moved tensorflow-probability into the tensorflow image now since it seemed more logical to keep those together, but the latest rocker/tensorflow-gpu instance hasn't finished building. We're still figuring out the right organizational modularity.
Re the DeprecationWarning, yeah, I see that too, @goldingn can probably give us more insight on that but I don't think it's much of a problem.
Not sure whether this ought to be a different error or not, but I get a strange error when trying greta with the ml-gpu container.
remotes::install_github("greta-dev/greta")
rm(list=ls())
library(reticulate)
py_discover_config()
use_python("/opt/virtualenvs/r-tensorflow/bin/python")
use_virtualenv("/opt/virtualenvs/r-tensorflow/", required=T)
library(greta)
library(DiagrammeR)
library(bayesplot)
library(tidyverse)
length_of_data <- 100
sd_eps <- pi^exp(1)
intercept <- -5.0
slope <- pi
x <- seq(-10*pi, 10*pi, length.out = length_of_data)
y <- intercept + slope*x + rnorm(n = length_of_data, mean = 0, sd = sd_eps)
data <- data_frame(y = y, x = x)
intercept_p <- uniform(-10, 10)
sd_eps_p <- uniform(0, 50)
slope_p <- uniform(0, 10)
mean_y <- intercept_p+slope_p*x
distribution(y) <- normal(mean_y, sd_eps_p)
our_model <- model(intercept_p, slope_p, sd_eps_p)
num_samples <- 1000
param_draws <- mcmc(our_model, n_samples = num_samples, warmup = num_samples / 10)
that gives the error
Error in py_call_impl(callable, dots$args, dots$keywords) :
ValueError: Tensor conversion requested dtype int64 for Tensor with dtype int32:
'Tensor("Placeholder_13:0", dtype=int32)'
So greta requires pretty careful coordination between versions of CUDA, tensorflow, and greta itself. I think this particular error is due to using the most recent dev version of greta with an older tensorflow (see https://github.com/greta-dev/greta/issues/248).
We're still exploring the best way to help users triangulate these versions. (The current tensorflow-gpu image is iirc still on cuda 9.0, which is too old for tensorflow > 1.13 which is required for greta > 0.3.0 or so? don't quote me on those versions).
Can you try testing on rocker/ml:cuda-10.0? (Note that it should already have greta installed).
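For triangulating the versions, something like the following might help when reporting back. It is a minimal sketch, assuming the Python side is reached through reticulate and that the modules are importable as tensorflow and tensorflow_probability; pkg_version_or_na is a hypothetical helper name. It only reports what is installed and does not check compatibility.

```r
# Hypothetical helper: installed version of an R package, or NA if absent.
pkg_version_or_na <- function(pkg) {
  # system.file() returns "" for packages that are not installed.
  if (nzchar(system.file(package = pkg))) {
    as.character(utils::packageVersion(pkg))
  } else {
    NA_character_
  }
}

# R-side packages that have to line up.
versions <- c(
  greta = pkg_version_or_na("greta"),
  tensorflow = pkg_version_or_na("tensorflow"),
  reticulate = pkg_version_or_na("reticulate")
)
print(versions)

# Python-side modules, queried only when reticulate is installed.
if (!is.na(versions[["reticulate"]])) {
  for (mod in c("tensorflow", "tensorflow_probability")) {
    if (reticulate::py_module_available(mod)) {
      cat(mod, reticulate::import(mod)$`__version__`, "\n")
    } else {
      cat(mod, "not found in the active Python environment\n")
    }
  }
}
```

Pasting this output along with the error makes it easier to see which of greta, tensorflow, or tensorflow-probability is out of step.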
I was trying to play with greta using this container but I'm getting an error. This is what I am doing: