Windows support - Githubissues

cranic commented 12 years ago

(Was "Windows build fails", ed:RV)

Actual NPM error log: https://gist.github.com/3730932

I just found something about compiling leveldb in Windows: https://groups.google.com/d/msg/leveldb/VuECZMnsob4/tROLqJq_JcEJ

And here are the additional packages that are needed: http://www.boost.org/users/download

And some more information from official source: http://code.google.com/p/leveldb/source/browse/WINDOWS?name=windows

rvagg commented 12 years ago

Spent some time in Windows today trying to get my head around the issues involved here. Not as straightforward as I hoped unfortunately.

While using Boost is a possibility, it's huge and I certainly wouldn't want to bundle it just to get a Windows compile. It'd be nice if only the useful bits could be extracted but I haven't looked at it.
My GYP files for LevelDB and Snappy started from the ones that Chromium uses, and Chromium compiles in Windows of course. They do it by abstracting out the non-portable bits in most libraries they include. It may be possible to pull out the important bits and reuse those. See here, the important bits are in Chromium's "base", references are in "env_chromium.cc" and "leveldatabase.gyp".
There is a leveldbwin project that did a win32 port, looks a little old but may be useful.
I'm not sure exactly what this is, or if that guy has any relation to the LevelDB project but it looks like a fork that includes Windows build components, including some stuff snaffled from the above project. Looks more up to date but unfortunately also looks like the Windows stuff is munged in with the LevelDB stuff so I'm not sure how hard it would be to pull it out.

And that's all I have for now. Would be happy for others to pitch in here because this isn't a very high priority for me.

ghost commented 11 years ago

Someone is writing a pouchdb layer to use your levelup code. This means a 100% Node.js-based database using couchdb semantics.

It's over here: https://github.com/chesles/pouchdb

rvagg commented 11 years ago

OK peeps, the real solution to this is to simply use libuv since we have direct access to it anyway! I've already completed the basics of a leveldb port to libuv but unfortunately it requires Node 0.9 because of a required feature only introduced to libuv a month ago. So, maybe by the time 0.8 comes to an end we'll have proper Windows support in LevelUP.

chesles commented 11 years ago

Hey that's great news!

cranic commented 11 years ago

Awesome! Could you create an unstable branch with Node.js 0.9? This way we could start testing in our scripts :)

rvagg commented 11 years ago

I'll keep you informed here when I have something interesting, I have a branch on the go but it's only a start.

rvagg commented 11 years ago

A quick update on this. I have a libuv port.h working across platforms, but unfortunately there's a lot more work for an env.cc required for Windows/libuv than I suspected. There's even a few things we can't do with libuv and will have to be done manually for different platforms. In summary, I ran out of steam, but there is a libuv branch on the go.

ghost commented 11 years ago

It looks like Ben closed the libuv issue requesting a different API for file seeking.

So what's the way forward for getting a Windows port now? I assume it means adapting the code to use the API as it stands?

rvagg commented 11 years ago

yeah, it'll just mean an #ifdef WIN32 blah blah to switch between Posix and Windows in the env.cc file. A lot of the rest may be able to be done by libuv as is.

ghost commented 11 years ago

i see.

Of course I would want to eliminate the need for the conditionals too.

If you can get a build going in Travis, that would be awesome. I will be a happy tester.

I am also looking for libraries to allow levelup to be distributed like CouchDB is. If you know of any, that would be great.

I see no reason why we can't get something like Riak working.

juliangruber commented 11 years ago

For distribution you can check out level-scuttlebutt by dominictarr.

Re riak: we mustn't forget that leveldb is just an unopinionated storage engine. There are versions of riak that use leveldb though.

ghost commented 11 years ago

thanks

I know what you mean. I agree it's great to keep levelup as an agnostic storage engine.

I was not suggesting we implement distributed storage in levelup, but more looking for GitHub projects that implement it on top of levelup.

I see that scuttlebutt has map reduce to allow many storage instances to be used.

But I am more thinking of data replication, key ring based storage, a leveldb server to handle connection pooling, etc.

It would be really great to add a list of these to the main leveldb web page, maybe?

ghost commented 11 years ago

OK, I correct myself. You have the Modules page in the wiki :)

dominictarr commented 11 years ago

level-scuttlebutt takes one opinionated approach. I have been thinking of some others, but it depends on the particular use-case you have.. @gedw99 can you start a separate issue to discuss replication in, describing your use-case?

ghost commented 11 years ago

I am building a 3D CAD modelling system and have tons of JSON data that I need to store on servers in many data centers. I run offline using IndexedDB and so also need to sync.

Originally I used pouchdb and couchdb.

But I want to replace all of it with LevelDB.

kevinswiber commented 11 years ago

@rvagg Is there a list of changes required before Windows support is working? I want to use LevelDB for a project I'm working on, but cross-platform support is a promise I'd like to keep. I'd be willing to help put some time into this if needed. There will be a slight learning curve for me as I get up to speed. If you're looking for help, hit me up!

rvagg commented 11 years ago

Pretty much all of this needs to be implemented. See the POSIX version. Tho we're using libuv to do it so some of it is not as hard as it looks. I'm planning on committing an initial version of that file soon(ish), once that's done then feel free to tinker too. Use https://github.com/rvagg/leveldb/ instead though, my libuv-port branch is where the action's at.

kevinswiber commented 11 years ago

Great! In the meantime, I've started working on a Bitcask clone that I'm hoping to keep 100% JavaScript. For the project I'm working on, I should be able to offer both strategies (Bitcask clone, LevelDB) once cross-platform compatibility is complete for LevelDB. Once I'm done with the Bitcask-y stuff, I should have some availability to look at this. (I don't think it will take much longer. It's delightfully simple.)

Thanks again for the info. I'll be on the lookout for that commit.

dominictarr commented 11 years ago

@kevinswiber excellent! reading the document you link, bitcask sounds architecturally very similar to leveldb - except that since it uses an in memory hash-table as an index it doesn't seem like it would support range queries - just straight gets.

Is that correct?

Are the inactive files stored sorted or unsorted?

kevinswiber commented 11 years ago

@dominictarr The keys are all unsorted, so range queries aren't as easy. To accomplish this with Basho's Bitcask implementation, we'd have to iterate over the keys until all matches are found or the end is reached.

For my clone-ish implementation, I'm thinking of doing the key-sort (perhaps in the hint file), to hopefully reap the SSTable benefit in this design. Then range queries should perform better.
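
To make the trade-off above concrete, here is a minimal sketch of the two kinds of range query. It is purely illustrative: the function names are hypothetical and neither function is taken from Bitcask, Medea, or LevelDB. With unsorted keys every key has to be inspected, while sorted keys allow a binary search to the start of the range followed by a short sequential read.

```javascript
// Hypothetical illustration: neither function is part of Bitcask or Medea.

// Unsorted keys (e.g. a hash-table keydir): every key must be inspected.
function rangeScanUnsorted(keys, start, end) {
  return keys.filter((k) => k >= start && k <= end).sort();
}

// Index of the first element >= target in a sorted array (binary search).
function lowerBound(sortedKeys, target) {
  let lo = 0;
  let hi = sortedKeys.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (sortedKeys[mid] < target) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Sorted keys: jump to the start of the range, then read until past the end.
function rangeScanSorted(sortedKeys, start, end) {
  const result = [];
  let i = lowerBound(sortedKeys, start);
  while (i < sortedKeys.length && sortedKeys[i] <= end) {
    result.push(sortedKeys[i]);
    i += 1;
  }
  return result;
}

const unsorted = ["pear", "apple", "fig", "cherry", "banana"];
const sorted = [...unsorted].sort();

console.log(rangeScanUnsorted(unsorted, "b", "d")); // [ 'banana', 'cherry' ]
console.log(rangeScanSorted(sorted, "b", "d"));     // [ 'banana', 'cherry' ]
```

Both calls return the same keys; the difference is that the sorted version touches only the keys inside the range plus a logarithmic number of probes.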

(We're way off-topic, though. I'm @kevinswiber on Twitter and kswiber at gmail if you want to take it off the levelup issues list.)

ghost commented 11 years ago

Hey Kevin, are you thinking of building a Riak-like system on top of LevelDB? I have been thinking the same.

kevinswiber commented 11 years ago

@gedw99 Yeah, something like that. Long-term, I'm planning to build a distributed key-value database that can be easily embedded in Node.js apps. My current use case is to add built-in caching support for an HTTP proxy and origin server I'm building.

Some constraints are:

Keep it cross-platform
Avoid native modules as much as possible
Make it fast
Make it embeddable.

Thinking down the road, I'll give @dominictarr's Scuttlebutt implementation a go for gossiping nodes. But virtual bucket distribution, gossip protocol, consistent hashing, vector clocks, hinted handoff, anti-entropy measures... none of these matter without a solid database structure. So that's the first priority.

I'd prefer a 100% JavaScript database as the default, while allowing folks to branch out to other implementations as needed (such as LevelDB).

ghost commented 11 years ago

Kevin,

that's exactly what I have been wanting to do, but no time.

I reckon I can help test and do some bug fixing once the base architecture is in place.

One thing that I feel is important to incorporate, for my use cases anyway, is offline browser support. So that means:

queries can run on client or browser
sync support.

Is this something on your roadmap too, I wonder?

kevinswiber commented 11 years ago

@gedw99 Yep. The idea is to remain decentralized, peer-to-peer. Availability first, consistency second. Hit me up on Twitter or e-mail if you want to talk further. My Issues list etiquette is currently suspect. ;)

dominictarr commented 11 years ago

@kevinswiber @gedw99 I assure you that such discussions are in fact on topic in a levelup issue! It's pretty much become a clearing house for such things.

I started working on a pure js leveldb clone with the leveldown api. I have https://github.com/dominictarr/json-logdb (which is simple, and works) and have started http://github.com/dominictarr/json-sst (which was more difficult than I expected, but I have a few ideas for when I next get time).

My plan was to use a newline-separated JSON format for simplicity, and then swap it out with something binary compatible with leveldb later.
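
As a rough idea of what a newline-separated JSON log can look like, here is a minimal sketch. It is an assumption made for illustration, not the actual json-logdb format: the entry shape (`type`, `key`, `value`) and the helper names are hypothetical. Each operation becomes one line, and replaying the lines in order rebuilds the current state.

```javascript
// Hypothetical sketch; not the actual json-logdb on-disk format.

// Encode one operation as a single line of JSON.
// JSON.stringify escapes newlines inside strings, so one entry is one line.
function encodeEntry(type, key, value) {
  return JSON.stringify({ type, key, value }) + "\n";
}

// Replay a log (newline-separated JSON text) into a Map of the current state.
function replayLog(logText) {
  const state = new Map();
  for (const line of logText.split("\n")) {
    if (line === "") continue; // skip the empty string after the last newline
    const entry = JSON.parse(line);
    if (entry.type === "put") state.set(entry.key, entry.value);
    else if (entry.type === "del") state.delete(entry.key);
  }
  return state;
}

// Appending is just string (or file) concatenation; later entries win.
let log = "";
log += encodeEntry("put", "a", 1);
log += encodeEntry("put", "b", 2);
log += encodeEntry("del", "a");
log += encodeEntry("put", "b", 3);

const state = replayLog(log);
console.log(state.get("b")); // 3
console.log(state.has("a")); // false
```

The appeal of the format is that it can be inspected and repaired with ordinary text tools; the cost is size and parse time, which is presumably why a binary format is the later goal.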

I think @mixu also needs to be a part of this discussion!

Raynos commented 11 years ago

REWRITE ALL THE DATABASES IN JAVASCRIPT

Looks like a normal levelup issue to me.

kevinswiber commented 11 years ago

@dominictarr Awesome. If you can throw your thoughts into Issues for json-sst and json-logdb, I'd love to take a look.

For my Bitcask effort (which I just started yesterday), I already have parts of the in-memory keydir and the active file puts/gets/deletes working. My next step is to look at log file cycling, merging, and hint file management. It will probably take more studying and tinkering than real development time (I'm hoping).

I'm not sure there will be anything reusable in there, but it should be an interesting science experiment we can examine.

rvagg commented 11 years ago

/cc @chilts cause he's also working in this area. https://github.com/chilts/level-dyno

chilts commented 11 years ago

Of course, this "Windows Support" issue is the wrong place for me to describe the project, but the short of it is to try and replicate similar functionality to DynamoDB. I think it's different to @dominictarr's level-scuttlebutt since each server won't hold 100% of the data. Also, the algorithm for the eventual consistency is different.

Anyway, just a work in progress at the moment and level-dyno is just a small beginning. :)

kevinswiber commented 11 years ago

For those interested in the Bitcask clone I mentioned, I open-sourced the start of it here:

https://github.com/argo/medea

I have some plans for it that stray from Bitcask (such as indexing map reduce operations).

Right now, basic get, put, and remove functionality is there. You can set a maxFileSize, which will create new data files based on growth. All keys are stored in memory on medea.open. The keydir holds information regarding the value's data file and offset.
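
The keydir idea described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not Medea's code or file format: `TinyStore` is a hypothetical name, in-memory Buffers stand in for the data files, and removal simply drops the keydir entry rather than writing a tombstone.

```javascript
// Hypothetical sketch; TinyStore is not Medea's actual API or file format.
class TinyStore {
  constructor(maxFileSize) {
    this.maxFileSize = maxFileSize; // roll over to a new "file" past this size
    this.files = [Buffer.alloc(0)]; // in-memory stand-ins for data files
    this.keydir = new Map();        // key -> { fileId, offset, size }
  }

  put(key, value) {
    const data = Buffer.from(value);
    let fileId = this.files.length - 1;
    // Start a new data file if this write would exceed maxFileSize.
    if (this.files[fileId].length > 0 &&
        this.files[fileId].length + data.length > this.maxFileSize) {
      this.files.push(Buffer.alloc(0));
      fileId += 1;
    }
    const offset = this.files[fileId].length;
    this.files[fileId] = Buffer.concat([this.files[fileId], data]);
    // Only the location is kept in the keydir, never the value itself.
    this.keydir.set(key, { fileId, offset, size: data.length });
  }

  get(key) {
    const entry = this.keydir.get(key);
    if (!entry) return undefined;
    const { fileId, offset, size } = entry;
    return this.files[fileId].subarray(offset, offset + size).toString();
  }

  remove(key) {
    // The stale bytes stay in the file until a merge process reclaims them.
    return this.keydir.delete(key);
  }
}

const store = new TinyStore(16);
store.put("a", "hello");         // file 0, offset 0
store.put("b", "world!!");       // file 0, offset 5
store.put("a", "a newer value"); // does not fit in file 0, so file 1 is created

console.log(store.get("a")); // prints "a newer value"
console.log(store.get("b")); // prints "world!!"
```

Overwriting a key only repoints the keydir entry, which leaves the old bytes behind in the earlier file. That is the garbage a separate merge process would have to clean up.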

You can do basic (non-cached) map-reduce operations using medea.mapReduce.

I haven't implemented the separate merge process yet that cleans up the log files, but I thought I'd open source it sooner than later. It's usable even without that at the moment.

Unlike LevelDB, Medea stores key-value pairs unsorted. I'm planning to add sorted index support using map-reduce operations sometime in the future. That should allow range queries and other goodies that are slightly better performance-wise.

Cheers.

/cc @dominictarr @rvagg @gedw99

rvagg commented 11 years ago

@kevinswiber if you get sorting working properly you could plug in as a backend to levelup. From 0.7 we'll have a 'db' option that lets you replace leveldown. See example here: https://github.com/rvagg/node-memdown which uses an abstract implementation of the leveldown API here: https://github.com/rvagg/node-abstract-leveldown so it's pretty easy as long as you can provide the basic operations + sorting. The only catch is that leveldown will still be installed, but perhaps we'll move to leveldown as a peerDependency one day.. perhaps.

dominictarr commented 11 years ago

@kevinswiber what advantage comes from not sorting? It seems to me that there is a significant advantage from keeping the immutable files sorted. It appears that the reason the riak people decided not to sort them was that, due to the dynamo-like architecture of riak, they would be unable to take advantage of sorting anyway.

what do you think?

kevinswiber commented 11 years ago

@dominictarr

I think the primary advantage is in a lack of code complexity. I believe that was Basho's angle. I had a week to do this, so lack of complexity was appealing. :)

FWIW, Basho is actually promoting LevelDB as a backend storage for Riak more than what I see for Bitcask.

That said, I'd love to branch Medea with LSM trees and SSTables for a comparison. I'll just have to do it on nights and weekends. :)

rvagg commented 11 years ago

Bitcask also puts everything into RAM, doesn't it? So you hit ceilings pretty quickly. LevelDB has a lot more flexibility around caching I think.

kevinswiber commented 11 years ago

The entire key set is in RAM. Keys in memory point to files and offsets for locating data values. The idea is to try to take advantage of the OS read ahead cache for the data files.

kevinswiber commented 11 years ago

Gents,

For the sake of development speed, I started off using a JavaScript object as a hash table for key:file/offset pairs in memory. It turns out this hits V8 pretty intensely and slows everything down.

I switched it out for a red black tree, but it's actually slower (maybe my rb tree sucks, too).

Any suggestions for a JavaScript-only hash table in Node? Does something significant already exist? Also, a concern is that the hashing, inserts, resizing, etc. will be too CPU-intensive in JavaScript under heavy load. I'm hoping this isn't true. I'd love to keep it pure JavaScript for ease of deployment and future maintenance.

Thanks.

sandfox commented 11 years ago

@kevinswiber How does your red-black tree compare with these ones: https://github.com/scttnlsn/redblack.js and https://github.com/vadimg/js_bintrees ? In a previous project I turned my entire (kd) tree into a giant buffer (I knew the number of nodes and size ahead of time) to avoid the GC/heap (it was about 2+ GB, from hazy memory) and because I was just throwing data from the buffer straight out to clients via a socket, so had no need to parse it. It was crazy crazy fast, to put it bluntly. I never had the time to finish it but planned to allow dynamically inserted nodes and implement resizing buffers. Hopefully this might be of some use to you.
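
A minimal sketch of the "tree in one buffer" trick, under loud assumptions: it uses a plain (unbalanced) binary search tree rather than a kd-tree or red-black tree, keys and values are 32-bit integers (for example a key hash and a file offset), and `PackedTree` is a hypothetical name. The point it illustrates is that nodes live as fixed-size slots in a single preallocated typed array, so the garbage collector sees one object instead of one per node.

```javascript
// Hypothetical sketch of a binary search tree packed into one typed array.
const FIELDS = 4; // each node occupies 4 slots: key, value, left, right
const NIL = -1;   // marks a missing child

class PackedTree {
  constructor(capacity) {
    // One allocation up front; nodes are slots, not individual JS objects.
    this.slots = new Int32Array(capacity * FIELDS);
    this.capacity = capacity;
    this.count = 0;
  }

  // Claim the next free slot group and return its node index.
  _alloc(key, value) {
    if (this.count >= this.capacity) throw new RangeError("tree is full");
    const base = this.count * FIELDS;
    this.slots[base] = key;
    this.slots[base + 1] = value;
    this.slots[base + 2] = NIL;
    this.slots[base + 3] = NIL;
    this.count += 1;
    return this.count - 1;
  }

  insert(key, value) {
    if (this.count === 0) {
      this._alloc(key, value); // node 0 is always the root
      return;
    }
    let node = 0;
    for (;;) {
      const base = node * FIELDS;
      if (key === this.slots[base]) {
        this.slots[base + 1] = value; // overwrite an existing key
        return;
      }
      const childSlot = base + (key < this.slots[base] ? 2 : 3);
      if (this.slots[childSlot] === NIL) {
        this.slots[childSlot] = this._alloc(key, value);
        return;
      }
      node = this.slots[childSlot];
    }
  }

  find(key) {
    let node = this.count === 0 ? NIL : 0;
    while (node !== NIL) {
      const base = node * FIELDS;
      if (key === this.slots[base]) return this.slots[base + 1];
      node = this.slots[base + (key < this.slots[base] ? 2 : 3)];
    }
    return undefined;
  }
}

const tree = new PackedTree(8);
for (const [k, v] of [[50, 500], [20, 200], [70, 700], [60, 600]]) {
  tree.insert(k, v);
}
console.log(tree.find(60)); // 600
console.log(tree.find(99)); // undefined
```

Because the capacity is fixed at construction, this matches the "I knew the number of nodes ahead of time" case; growing the tree would mean allocating a larger array and copying the slots across.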

kevinswiber commented 11 years ago

@sandfox Thanks for the tips. Redblack.js was quite a bit slower, and I couldn't get the js_bintrees RBTree to find any keys. Not sure what's going on there.

My RedBlackTree implementation might be all right. I'm not sure my assessment of the problem was correct. I don't think the V8 hash table is actually performing that poorly at the numbers I'm using.

The profiler shows a lot of time spent in ScavengeObject, which I believe is used by the GC. Maybe I'll play with the bufferizing trick you mentioned.

Thanks!

ghost commented 11 years ago

you can turn off garbage collection in v8 i think. Or at least control it in some way.

kevinswiber commented 11 years ago

Everyone: Thanks for the great discussion around this. I just pushed v0.0.2 to NPM. @juliangruber has a great repo showing benchmark comparisons. Thanks to his recent work, Medea just made the list: https://github.com/juliangruber/multilevel-bench

Snippet:

                      Medea (10.000x)
          12,324 op/s ⨠ set small
          12,313 op/s ⨠ set medium
          12,248 op/s ⨠ set large
          40,566 op/s ⨠ get large
          44,246 op/s ⨠ get medium
          45,174 op/s ⨠ get small

Here are some numbers from my machine: Intel Core i5, 8GB DDR3 RAM, Node v0.10.1:

» node ./example/bench.js 
set small >>
    15867.98 ops/s
set medium >>
    15048.91 ops/s
set large >>
    14178.36 ops/s
get large >>
    51440.33 ops/s
get medium >>
    56915.2 ops/s
get small >>
    63613.23 ops/s

I'm sure all these numbers will change over time, but I'm pretty happy with this for now. It's good enough to fit my use case (Web caching).

Cheers!

rvagg commented 11 years ago

Check this out peeps

[image: LevelDOWN on Windows] https://pbs.twimg.com/media/BGGSc9zCIAAfvhs.png:large

see windows branch on LevelDOWN and current rvagg/leveldb libuv-port branch.

No snappy yet but I don't imagine that to be a problem. This also requires Node 0.10+ cause uv_cond_var is only new. I did have a 1/2 started backport of just those bits so it would be compatible with Node 0.8+ but I'm not sure I'll bother. I might target Node 0.10+ for Windows and Node 0.8+ for everyone else.

Lots of cleanup to do and I have to make sure the compile works on non-Windows platforms, but we're getting close.

Distributing binaries is a whole different issue but we can think about that later.

ghost commented 11 years ago

Thanks. Great stuff

kevinswiber commented 11 years ago

@rvagg Excellent! I might fire up my dusty Windows 7 VM and check it out. How does the performance compare running on libuv vs LevelDB's built-in POSIX support? I'd be interested in seeing some metrics.

This is great.

ghost commented 11 years ago

Ragg, can you give us a download link where we can get the binary? Want to try too :)

rvagg commented 11 years ago

@kevinswiber I have no idea about performance, I'd love someone to try and come up with some benchmarks. The bits that libuv is doing are pretty simple and are just providing a wrapper around some thread stuff. The stuff that will matter more is code I've pulled in from here. You can have a look at it here. There's some things in there that I'd rather bind to libuv but that can wait for another day. It's also possible that there are things in there that could be done more efficiently but I haven't spent much time looking over it. I'd be happy for anyone else that wants to contribute to have a look over it!

My suspicion from seeing the speed of the tests is that it's not going to live up to posix performance, but I may just be observing process start/stop and file create/delete overhead. It's also possible that my Windows VM is having trouble performing well (and I'm not running it on an SSD, which I normally use on my Linux machines).

@gedw99 I don't have binaries for you, I'm thinking that post-release I may be able to produce 64 and 32 bit bundles for download that you can put in your node_modules directory, as a stop-gap. But not yet. I recommend that you get set up for running node-gyp on your machine then you'll be able to compile other native addons. The components are all free but it can be a bit of a pain on 64 bit machines. The details are all here, there's nothing particularly difficult, it's just annoying.

rvagg commented 11 years ago

LevelUP@0.7.0 now uses LevelDOWN@0.2.x which has Windows support. See the top of the README for instructions on what you need in order for it to compile when you npm install it. I'm going to close this issue for now, feel free to keep chatting but if you have anything specific about the Windows version then please open a new issue for it.

ghost commented 11 years ago

awesome stuff !

Level / levelup

Windows support #5