CentML / centml-python-client

Apache License 2.0

Local compile server can't send large files to client #13

Open destefy opened 9 months ago

destefy commented 9 months ago

Trying to compile xlm-roberta-xl fails. This is probably because the model is either too large for the server to unpack or too large to send over the connection.

When the server sends the compiled model to the client for download, the server side gives the following error: h11._util.LocalProtocolError: Too much data for declared Content-Length

On the client side: requests.exceptions.ChunkedEncodingError: ('Connection broken: IncompleteRead(4957798400 bytes read, 10491 more expected)', IncompleteRead(4957798400 bytes read, 10491 more expected))


destefy commented 9 months ago

Even after implementing GZip HTTP compression, when the server tries to load the compiled model it gives:

MemoryError: Can not allocate 50 MiB from cuda:0 device.
Status of cuda:0 memory pool
Allocated: 7596 MiB
Peak: 7596 MiB
Reserved: 0 Bytes
Active: 7596 MiB

destefy commented 9 months ago

At its core this seems to be an issue with FastAPI imposing a size restriction on files sent (HTTP has no such restriction). We can get around this for now for the xlm-roberta-xl case with compression, but it may come up again with an even larger model.

A better solution would be to send the compiled graph from server to client in chunks. With larger models, the same problem may also occur when sending from client to server.

This also comes up when sending from client to server. While the GraphModule has to be in memory, we should keep the serialized model we get with torch.save/pickle on disk. Then we can send it from disk, or load it in chunks.
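A minimal sketch of the "keep it on disk and move it in chunks" idea, using only the standard library. The helper names (iter_file_chunks, write_chunks) and the 1 MiB chunk size are assumptions for illustration and are not taken from this repo. On the server, a generator like this could presumably be handed to FastAPI's StreamingResponse; on the client, the chunks from requests' iter_content could be written straight to disk.

```python
from typing import Iterable, Iterator

CHUNK_SIZE = 1024 * 1024  # 1 MiB per chunk; an arbitrary choice for this sketch


def iter_file_chunks(path: str, chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Yield the file at `path` in pieces of at most `chunk_size` bytes.

    Only one chunk is held in memory at a time, so the serialized model
    never has to be loaded whole.
    """
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk


def write_chunks(chunks: Iterable[bytes], dest_path: str) -> int:
    """Write an iterable of byte chunks to `dest_path`; return bytes written."""
    total = 0
    with open(dest_path, "wb") as f:
        for chunk in chunks:
            f.write(chunk)
            total += len(chunk)
    return total


# Possible wiring (hypothetical endpoint and paths, not the repo's actual code):
#   server: return StreamingResponse(iter_file_chunks(model_path),
#                                    media_type="application/octet-stream")
#   client: resp = requests.get(url, stream=True)
#           write_chunks(resp.iter_content(chunk_size=CHUNK_SIZE), local_path)
```

Because the response is produced from a generator, the server does not have to declare a Content-Length up front, which may sidestep the "Too much data for declared Content-Length" error above. That part is untested here.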

anandj91 commented 6 months ago

https://requests.readthedocs.io/en/latest/user/advanced/#streaming-uploads
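For the client-to-server direction, the linked requests docs describe passing an open file object as `data`, so the body is streamed from disk instead of being read into memory first. A rough sketch under that assumption follows; the function name upload_file_streaming, the `post` parameter (there only so the function can be exercised without a network), and the example URL are all hypothetical. The server endpoint would also need to read the raw request body rather than a multipart form upload.

```python
from typing import Any, Callable, Optional


def upload_file_streaming(
    url: str,
    path: str,
    post: Optional[Callable[..., Any]] = None,
) -> Any:
    """Stream the file at `path` to `url` as the raw request body.

    `post` defaults to requests.post; it is injectable only so this sketch
    can be tried without a real server.
    """
    if post is None:
        import requests  # imported lazily so the sketch loads without it

        post = requests.post

    # Passing the open file object (not f.read()) lets requests stream it.
    with open(path, "rb") as f:
        return post(
            url,
            data=f,
            headers={"Content-Type": "application/octet-stream"},
        )


# Example call (hypothetical URL and path):
#   upload_file_streaming("http://localhost:8000/submit", "serialized_model.pkl")
```

The same docs page also covers chunk-encoded requests, where `data` is a generator instead of a file object.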

destefy commented 2 months ago

GZip

I don't think I'm properly using the gzip middleware. This says I need a header link