attic-labs / noms

The versioned, forkable, syncable database
Apache License 2.0
7.45k stars 267 forks source link

Enable walking refs in a Chunk without decoding to a Value #3738

Closed cmasone-attic closed 6 years ago

cmasone-attic commented 6 years ago

This patch introduces WalkRefs(c chunks.Chunk, cb RefCallback), which allows the refs in a Chunk to be walked without first decoding the Chunk into a Value. It's useful in e.g. datas.Pull(), which needs to walk the Chunk graph, but doesn't actually care about the Values.

On my macbook, syncing e.g. a large dataset imported from a .csv file is about 4x faster.

cmasone-attic commented 6 years ago

Hey, @arv this doesn't have all the testing I want, and isn't fully commented or cleaned up, but how do you feel about the approach I'm taking to sharing code between refWalker and valueDecoder?

arv commented 6 years ago

This seems reasonable.

It does feel like we should have created a visitor interface that we implement here. We would then implement the no op call back as a struct and have walk refs embed and override visitRef.

I'm not say that you need to do that here but I feel like this code is getting a bit repetitive. I'm also not sure if that helps the readValue, skipValue cases though.

cmasone-attic commented 6 years ago

I'm concerned about boxing impacting performance, because these calls happen so very, very frequently. That's a rational concern, right? We could try a visitor interface of some kind and see if there's a slow down compared to the solution here...

Do you have any thoughts on that, @rafael-atticlabs ? On Mon, Sep 25, 2017 at 12:58 PM Erik Arvidsson notifications@github.com wrote:

This seems reasonable.

It does feel like we should have created a visitor interface that we implement here. We would then implement the no op call back as a struct and have walk refs embed and override visitRef.

I'm not say that you need to do that here but I feel like this code is getting a bit repetitive. I'm also not sure if that helps the readValue, skipValue cases though.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/attic-labs/noms/pull/3738#issuecomment-331945052, or mute the thread https://github.com/notifications/unsubscribe-auth/AMnImgAdaQUvisIuyaLLPv__Ngb7dbetks5sl9uogaJpZM4Phjqj .

cmasone-attic commented 6 years ago

PTAL

arv commented 6 years ago

Please assign to Raf

cmasone-attic commented 6 years ago

Ok, reassigned to @rafael-atticlabs