nspcc-dev / neofs-sdk-go

Go implementation of NeoFS SDK
Apache License 2.0

slicer: No way to specify known payload size #540

Closed cthulhu-rider closed 7 months ago

cthulhu-rider commented 8 months ago

Is your feature request related to a problem? Please describe.

in the current implementation, the slicer always allocates a buffer of the size passed into https://pkg.go.dev/github.com/nspcc-dev/neofs-sdk-go/object/slicer#Options.SetObjectPayloadLimit. This leads to unnecessary memory consumption when the payload size is known in advance and is less than the specified limit

Describe the solution you'd like

when the payload size is known, just provide SetPayloadSize(N). If N is less than the limit set via SetObjectPayloadLimit, we allocate only N

in the NeoFS API protocol we have https://github.com/nspcc-dev/neofs-api/blob/cae99c9c6328250666f9e6944166cd8a4cc54cf8/object/types.proto#L93-L95. According to this, payload_length=0 stands for objects with an empty payload. The current SDK does not declare this behavior, which means some existing applications leave the field unset. We must take this into account and preserve the existing behavior. Those who already set the payload size are clearly expected to provide the correct value

Describe alternatives you've considered

in addition, for functions accepting io.Reader, when SetPayloadSize was not called, we may resolve the size for some concrete types (bytes.Reader, os.File, etc.). This is relatively cheap and will cover some cases

Additional context

carpawell commented 8 months ago

While it can be done, I am not sure we can rely on it: a client may use it or not. So I would try to find some optimizations via memory sharing (sync.Pool? not being afraid of the buffer growing on demand?) because the client's help is appreciated, but without it we should be able to do something too.

cthulhu-rider commented 8 months ago

if a client uses it, it must be correct. Protection from fake abuse is on a different level. Now we always allocate MaxObjectSize; afterwards we'll allocate at most MaxObjectSize. Overall, this will be more efficient.

application-level allocators certainly need to be supported, but they tune how a fixed amount of memory is allocated, while this issue is about determining that amount

carpawell commented 8 months ago

Protection from fake abuse is on different level.

What is fake abuse? I think we are talking about different things.

Overall, this will be more efficient.

Overall, this cannot be ensured. A client may not know the total size, what then? Create a new issue?

while this issue is about determining the amount

Don't mind, but it won't work for apps that just update to RC12 without explicit changes. Or for apps that simply do not want to specify the object size. Or can't, because they do not know it. I just complain that there is no similar issue about memory consumption by the slicer in general. IMO, we have to be flexible anyway, with or without this issue resolved.

carpawell commented 8 months ago

BTW, a simple sync.Pool of 64 MiB []byte buffers for 1k concurrent object PUTs is still ~64 GB of RAM usage, which is too far from the 1 TB limit from https://github.com/nspcc-dev/neofs-node/issues/2686. Not sure if 1k objects is a small or big number.

cthulhu-rider commented 7 months ago