Closed · lschaupp closed this 1 week ago
AFAIK you can't send it as is. It needs to be serialized.
@agunapal Would it be possible to have some kind of pinned memory where the torch tensors are loaded, and then simply share the memory pointer via the request? I have larger image files to handle, and I'm trying to find the most efficient way on systems with low CPU performance.
If you are doing this locally, it should be possible using shared memory. However, if you have a CUDA tensor, I don't think it works. At least, it didn't work previously.
Thanks for the info. Based on your response, it could work with tensors in shared memory (on "cpu"). That would be massively better than sending the file across, imho. Do we have any working example?
Edit: It is a local instance :)
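Not an official example, but here is a minimal sketch of the shared-memory idea for the local CPU case, using only the Python standard library (an `array.array` of float32 stands in for `tensor.numpy().tobytes()`, so it runs without torch; the `payload` dict and its keys are illustrative, not a TorchServe API). The point is that the request only needs to carry the shared-memory block name plus dtype/shape metadata, while the tensor bytes stay in a named `multiprocessing.shared_memory` block that the server process attaches to by name:

```python
from multiprocessing import shared_memory
import array

# --- Client side ---
# Stand-in for the tensor's raw bytes (e.g. tensor.numpy().tobytes()).
data = array.array("f", [0.1, 0.2, 0.3, 0.4])
nbytes = len(data) * data.itemsize

# Copy the bytes into a named shared-memory block.
shm = shared_memory.SharedMemory(create=True, size=nbytes)
shm.buf[:nbytes] = data.tobytes()

# The request payload carries only the block name and metadata,
# not the tensor data itself. (Illustrative dict, not a real API.)
payload = {"shm_name": shm.name, "dtype": "float32", "numel": len(data)}

# --- Server side (e.g. inside the handler) ---
# Attach to the same block by name; no bytes cross the request.
shm2 = shared_memory.SharedMemory(name=payload["shm_name"])
received = array.array("f")
received.frombytes(bytes(shm2.buf[: payload["numel"] * 4]))

# --- Cleanup ---
shm2.close()
shm.close()
shm.unlink()  # only the creating side should unlink
```

On the torch side you could rebuild a tensor from the attached buffer with `torch.frombuffer(...)` (available since torch 1.10) and then `.reshape(...)` using the metadata from the payload. Note this only covers CPU tensors, matching the caveat above that CUDA tensors don't work this way.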
I want to send a torch (CUDA) tensor via a Python request to the inference API. Is that possible?