jssmith / ssqlite

Serverless SQLite Experiments

Socket read errors in Lambda #13

Closed jssmith closed 6 years ago

jssmith commented 6 years ago

When running TPC-C in Lambda, error messages sometimes appear that are not seen when running in EC2, e.g.:

read 192.168.1.57/tpcc-nfs offset:5836800 bytes:4096 status server socket read error

This leads to failed SQLite transactions such as:

[WARNING]   2017-12-18T20:00:06.320Z    0afde782-e42e-11e7-b689-e77e850ec1ca    Failed to execute Transaction 'NEW_ORDER': SQL logic error

convolvatron commented 6 years ago

that almost certainly means that the connection is dropped. we can add that to the retry path

jssmith commented 6 years ago

Ok. What do you think about possible fragmentation? I'm not sure what sort of network layers / abstractions we might be sitting on top of in this environment. I will try wrapping the read in a loop.

convolvatron commented 6 years ago

I might be wrong because I haven't run a socket in blocking mode in forever, but I *think* we should be waiting until all the bytes have been received.

unless I misunderstand your question, but by all means dig into it. one of the things I'd like to do is have a sprintf thing so we can add additional information to the status strings

jssmith commented 6 years ago

I'm pretty sure I found it: we don't always read a full block. E.g., after adding logging:

read 192.168.1.57/tpcc-nfs offset:99737600 bytes:4096 partial read 2892 of 4200

On the logfix branch.
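
For anyone landing here later: on a stream socket, even a blocking read can return fewer bytes than requested, so the usual remedy is to loop until the full count has arrived. The following is a minimal Python sketch of that idea, not the code on the logfix branch; `recv_exact` is a hypothetical helper name, and the sizes in the demo are just the numbers from the log line above.

```python
import socket


def recv_exact(sock: socket.socket, nbytes: int) -> bytes:
    """Read exactly nbytes from sock, looping over short reads.

    Hypothetical helper for illustration; raises ConnectionError if the
    peer closes the connection before nbytes have arrived.
    """
    buf = bytearray(nbytes)
    view = memoryview(buf)
    got = 0
    while got < nbytes:
        # recv_into may fill only part of the remaining space.
        n = sock.recv_into(view[got:], nbytes - got)
        if n == 0:
            raise ConnectionError(
                f"connection closed after {got} of {nbytes} bytes"
            )
        got += n
    return bytes(buf)


if __name__ == "__main__":
    # Simulate a response that arrives in two pieces (2892 + 1308 = 4200).
    writer, reader = socket.socketpair()
    writer.sendall(b"a" * 2892)
    writer.sendall(b"b" * 1308)
    data = recv_exact(reader, 4200)
    print(len(data))
    writer.close()
    reader.close()
```

A premature close surfaces as `ConnectionError` rather than a silent short buffer, which would presumably be the point to hand off to the retry path mentioned earlier.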

convolvatron commented 6 years ago

sorry, didn't think that was supposed to happen without FIONBIO

jssmith commented 6 years ago

All good.