serizba / cppflow

Run TensorFlow models in C++ without installation and without Bazel
https://serizba.github.io/cppflow/
MIT License

Option for having output tensors allocated in device memory? #226

Open

jkrause1 commented 1 year ago

Hello,

I'm loading a model from a frozen graph and running it. When I check the device of the resulting output tensors, they all report /job:localhost/replica:0/task:0/device:CPU:0, implying they reside in host memory. I don't know whether this comes from how the graph is constructed or whether there are options I'm missing, but I would prefer that the outputs stay in device memory so I can access and process the data further via CUDA.
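For reference, the check being described might look like the sketch below. It assumes cppflow's `tensor::device()` accessor (which wraps the TF eager C API's device-name queries); the model path and input shape are placeholders, not from the original report:

```cpp
#include <iostream>
#include "cppflow/cppflow.h"

int main() {
    // Placeholder model path and input shape, for illustration only.
    cppflow::model model("my_model_dir");
    auto input = cppflow::fill({1, 224, 224, 3}, 1.0f);
    auto output = model(input);

    // device() reports the device that owns the tensor handle;
    // device(true) reports the backing memory device. In this issue
    // both print /job:localhost/replica:0/task:0/device:CPU:0,
    // i.e. the output lives in host memory.
    std::cout << output.device() << '\n';
    std::cout << output.device(true) << '\n';
    return 0;
}
```

As a possible workaround (untested here), the TF eager C API provides TFE_TensorHandleCopyToDevice, which could move the resulting handle onto a GPU device, though that is a copy after the fact rather than the output being allocated on the device in the first place.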

bmiftah commented 1 year ago

Hi, were you able to solve this? I ran into a problem with a related task, loading a frozen model. In my case the error I got is `terminate called after throwing an instance of 'std::bad_alloc' what(): std::bad_alloc`, so it clearly has something to do with memory. When I was loading the original model there was no issue at all; I only hit this after switching to the frozen model. The two models are nearly the same size, though the graph structure could differ, which I didn't check.

Here is how I load my frozen model:

```cpp
cppflow::model model("Froozen_model_dir", cppflow::model::TYPE::FROZEN_GRAPH);
```

and here is how I call inference on it with a sample input:

```cpp
output_tensor = model(input_1);
```

and I got this:

```
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Aborted (core dumped)
```

Any tips on how to solve this?
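One thing that may be worth checking (a guess, not a confirmed fix for the bad_alloc): cppflow's single-tensor `operator()` routes through default SavedModel operation names, which a frozen graph typically does not contain, so frozen graphs are usually run through the overload that takes explicit operation names. A sketch, with hypothetical node names:

```cpp
// Hypothetical node names; inspect the frozen graph to find the real
// input/output operation names before using them here.
auto outputs = model({{"x:0", input_1}}, {"Identity:0"});
auto output_tensor = outputs[0];
```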