nspcc-dev / neofs-api

NeoFS API documentation and proto files declaration
Apache License 2.0

Make it easy to seek through large objects #264

Closed roman-khimov closed 9 months ago

roman-khimov commented 1 year ago

There are several ways to do this:

  1. Store the size of each chunk in the link object along with the hash. This is the most flexible option, but it adds per-part storage overhead.
  2. Keep the link object as is, but store the offset from the beginning in each regular object. This at least allows bisecting to a particular offset instead of looping through the whole list (10 HEADs is much better than 1000 HEADs).
  3. Require individual parts to be of the same size and specify this size in the split data header. This is pretty easy to do: we probably HEAD header.split.previous anyway, so checking that the size of the object is the same as the previous one costs nothing.
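
To make the trade-offs concrete, here is a minimal Python sketch of how a reader could locate the part that holds a given payload offset under each of the three options. All names (`part_by_sizes`, `part_by_head`, `part_by_fixed_size`, and the `head_offset` callback standing in for a HEAD request) are hypothetical and are not part of the NeoFS API; this only illustrates the lookup logic.

```python
from bisect import bisect_right
from itertools import accumulate


def part_by_sizes(sizes, offset):
    """Option 1: the link object lists every part size.

    Build the start offset of each part locally, then binary-search it.
    No extra HEAD requests are needed. Returns (part index, offset in part).
    """
    total = sum(sizes)
    if not 0 <= offset < total:
        raise ValueError("offset outside of the object payload")
    # starts[i] is the offset of part i from the beginning of the object.
    starts = [0] + list(accumulate(sizes))[:-1]
    i = bisect_right(starts, offset) - 1
    return i, offset - starts[i]


def part_by_head(n_parts, head_offset, offset):
    """Option 2: every part stores its own offset from the beginning.

    head_offset(i) is a hypothetical callback that HEADs part i and returns
    its stored start offset. Bisection needs about log2(n_parts) calls.
    The end of the object cannot be checked here without the last part size.
    """
    lo, hi = 0, n_parts - 1
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if head_offset(mid) <= offset:
            lo = mid  # the target part is mid or later
        else:
            hi = mid - 1  # the target part is before mid
    return lo


def part_by_fixed_size(part_size, offset):
    """Option 3: all parts share one size taken from the split header.

    Plain integer arithmetic. Returns (part index, offset in part).
    """
    return divmod(offset, part_size)
```

With 1000 parts, `part_by_head` issues at most 10 `head_offset` calls, which matches the "10 HEADs vs 1000 HEADs" estimate in option 2.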

carpawell commented 10 months ago

I vote for the first one, as it is the most explicit way to carry object-handling information to me: the helper object holds the helper info. An offset inside every object seems redundant for the object itself: it does not take part in the assembly and is not required for storing the payload. Requiring the same size would be the best and the easiest option to me, but we have not done it so far and that looks like a feature; maybe we do not want to be so strict.

roman-khimov commented 10 months ago

Storing an offset in a small object costs nothing, whereas storing sizes in the link object does have some cost (~10% additional per-object overhead). We can do both, though.

Equal-sized parts are not compatible with a non-reslicing S3 multipart handler (https://github.com/nspcc-dev/neofs-s3-gw/issues/843).