gnocchixyz / gnocchi

Timeseries database
Apache License 2.0

Leverage Swift DLO to store Carbonara splits #42

Open jd opened 7 years ago

jd commented 7 years ago

After chatting with @thiagodasilva, it seems possible to use the Swift DLO mechanism to append new aggregated measures to Carbonara splits directly.

The idea is to create a DLO manifest for a given split and then append to it using PUT requests, without reading the previous data, as the Ceph driver does.

The only open question right now is that each write would create a very small object in Swift (1 point is 9 bytes), so potentially a lot of very small files: up to 3600 files of 9 bytes each for a whole split in the worst case. It's not clear whether Swift can handle that well.
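To put numbers on that worst case, here is a small back-of-the-envelope sketch. It only uses the two figures quoted above (9 bytes per point, 3600 points per split); the constant names are made up for illustration.

```python
# Figures taken from the comment above; names are illustrative only.
POINT_SIZE_BYTES = 9      # one aggregated point
POINTS_PER_SPLIT = 3600   # worst case: one object per point

# Worst case: every point lands in its own segment object.
worst_case_objects = POINTS_PER_SPLIT
split_payload_bytes = POINT_SIZE_BYTES * POINTS_PER_SPLIT

print(worst_case_objects, "objects for", split_payload_bytes, "bytes of data")
```

So the payload of a full split stays small (about 32 KB); the concern is the object count, not the volume of data.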

thiagodasilva commented 7 years ago

Regarding the issue with small files in Swift, I kind of came up with an idea, but it would require some prototyping to make sure you actually get real performance benefits out of it... and I'm also making a whole lot of assumptions about how metricd works and what kind of information it has... so bear with me...

The idea is to use a COPY request to concatenate the small objects for you. I think you should do this from time to time, maybe every 10th, 100th, or 10000th PUT. It would look something like this:

  1. PUT /acct/cont/temp_average # this is the DLO manifest file
  2. PUT /acct/cont/temp_average/01
  3. PUT /acct/cont/temp_average/02
  4. PUT /acct/cont/temp_average/03
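As a rough illustration of the append sequence above, here is a toy in-memory model of a container with a DLO manifest. This is not the Swift API: `FakeContainer` and its methods are hypothetical stand-ins, and the model assumes that a read of the manifest concatenates every object under the manifest's prefix in name order, which is why the segment names are zero-padded.

```python
class FakeContainer:
    """Toy in-memory stand-in for one Swift container (hypothetical, not the Swift API)."""

    def __init__(self):
        self.objects = {}    # object name -> bytes
        self.manifests = {}  # manifest name -> segment prefix

    def put_manifest(self, name, prefix):
        # In real Swift this would be a zero-byte PUT carrying an
        # X-Object-Manifest header that points at the segment prefix.
        self.manifests[name] = prefix

    def put(self, name, data):
        self.objects[name] = data

    def get(self, name):
        if name in self.manifests:
            prefix = self.manifests[name]
            # Segments are joined in lexicographic name order.
            segments = sorted(n for n in self.objects if n.startswith(prefix))
            return b"".join(self.objects[n] for n in segments)
        return self.objects[name]


container = FakeContainer()
container.put_manifest("temp_average", "temp_average/")

# Three appends, each a 9-byte point written without reading existing data.
points = [b"A" * 9, b"B" * 9, b"C" * 9]
for index, point in enumerate(points, start=1):
    container.put(f"temp_average/{index:02d}", point)

split = container.get("temp_average")
print(len(split))
```

The point of the sketch is that each append is a blind PUT of a new segment; only readers pay the cost of stitching segments together.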

Now you can just do a GET on /acct/cont/temp_average, and that will return all those small objects concatenated. To reduce the number of these small objects in the system, you could issue a COPY request every nth append iteration:

  1. COPY /acct/cont/temp_average -H 'Destination: cont/temp_average/04'
  2. DELETE /acct/cont/temp_average/01
  3. DELETE /acct/cont/temp_average/02
  4. DELETE /acct/cont/temp_average/03

Now your container would have two objects: temp_average and temp_average/04.
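The compaction pass above could be generated mechanically. The following is a minimal sketch that only builds the list of requests and sends nothing; `compaction_plan` and its parameters are hypothetical names, and it assumes the caller already knows which segments exist and picks a target name that sorts after them.

```python
def compaction_plan(account, container, manifest, segments, target):
    """Return (method, path, headers) tuples for one compaction pass.

    The COPY of the manifest writes the concatenated data to `target`;
    the DELETEs then remove the segments that were folded into it.
    """
    base = f"/{account}/{container}"
    plan = [("COPY", f"{base}/{manifest}", {"Destination": f"{container}/{target}"})]
    plan += [("DELETE", f"{base}/{name}", {}) for name in segments]
    return plan


plan = compaction_plan(
    account="acct",
    container="cont",
    manifest="temp_average",
    segments=["temp_average/01", "temp_average/02", "temp_average/03"],
    target="temp_average/04",
)

for method, path, headers in plan:
    print(method, path, headers)
```

Keeping the plan as plain data makes it easy to decide separately how often to run it (every 10th, 100th, or 10000th PUT) and which HTTP client to send it with.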

Also, just FYI, the Swift community is working on a small file optimization, which would still allow you to use the DLO mechanism without having to worry about the small file performance hit.

jd commented 7 years ago

I think what you describe would work indeed. The problems are that:

Maybe it'd be simpler to wait for Swift to optimize its DLO. Is there anything we can track to follow the status of that feature?