Open pcouture opened 7 years ago
This library is EXTREMELY unreliable.
Agreed -- I have recently been running into a lot of issues with an old project that still uses this library. I've been searching for a replacement, but I am very wary of a lot of these legacy S3 implementations floating in npm that use knox or raw http or any other hand-rolled solutions.
(If anyone knows of a module that implements the "sync this folder to S3" functionality, built directly on top of the official aws-sdk node module, let me know!)
@elliot-nelson be the change you want to see ;) I don't use this module anymore but if you're looking for it, go for it!
I think syncing a directory is deceptively complex. It's a bit like deep-copying a JS object: there's no single go-to lib or implementation because, while it sounds really simple, it means different things in different contexts, and each use case requires special treatment.
Using the native lib, you can do something like readdirp(path).on('data', (entry) => s3.putObject({ ..., Body: createReadStream(entry.fullPath) })) as a starting point, and deal with your specific requirements or edge cases from there.
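For anyone who wants that starting point spelled out, here is a minimal sketch using only Node's built-in fs and path modules instead of readdirp. The names listKeys, uploadDir and the injected putObject function are made up for illustration; with aws-sdk v2 you would presumably pass something like (params) => s3.putObject(params).promise(). It does none of the hard parts (no excludes, no checksums, no deletes).

```javascript
const fs = require('fs');
const path = require('path');

// Recursively list regular files under `root`, returned as S3-style keys:
// relative to `root`, with forward slashes, sorted.
function listKeys(root, dir = root) {
  const keys = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      keys.push(...listKeys(root, full));
    } else if (entry.isFile()) {
      keys.push(path.relative(root, full).split(path.sep).join('/'));
    }
  }
  return keys.sort();
}

// Upload every file under `root`, one at a time.
// `putObject` is a hypothetical injected function: (params) => Promise.
async function uploadDir(root, bucket, putObject) {
  const keys = listKeys(root);
  for (const key of keys) {
    await putObject({
      Bucket: bucket,
      Key: key,
      Body: fs.createReadStream(path.join(root, key)),
    });
  }
  return keys;
}
```

Injecting the upload function keeps the directory walk testable without touching S3, and makes it easy to swap in a dry-run logger while you work out your own edge cases.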
The problem with creating a lib to do this is that there are so many edge cases you need to accommodate that the API for your lib would be more complex than the aws-sdk API.
I think the real problem is the AWS documentation. I'm sure I'm not the only one who ended up here because I've looked at the AWS sdk docs, and noped out of there thinking someone must have made a nice simple abstraction layer on npm. Granted I'm not a smart guy but trying to read their docs is "frustrating" to put it mildly... you're better off just trying to guess how it's supposed to work and then figure it out from the errors you throw (which are actually pretty concise).
I really like the aws-sdk library for node, actually, and have been toying half-heartedly with rolling my own sync. Like you said, though, actually uploading the files is the easy 5% of the problem. To have a working tool you are essentially copying all of what aws-cli already provides -- specifying which files to sync via globs, specifying files to exclude, whether to check MD5 sums before uploading a file, whether to delete existing files that aren't in the local folder, and don't forget that sums of files uploaded with multipart uploads are tricky and need to be calculated differently...
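To make the multipart point concrete, here is a rough sketch of the two checksum styles as I understand them. For a plain single-request upload the ETag is normally the hex MD5 of the body; for multipart uploads the commonly observed format is the MD5 of the concatenated raw part digests followed by "-" and the part count. That second format is observed behavior rather than something I'd treat as a documented guarantee, and neither holds for every encryption setup. The function names are made up, and the code assumes a non-empty Buffer and that you know the part size used for the original upload.

```javascript
const crypto = require('crypto');

// Helper: a fresh MD5 hash already fed with `buf`.
const md5 = (buf) => crypto.createHash('md5').update(buf);

// Single-request upload: ETag is (usually) the hex MD5 of the whole body.
function singlePartEtag(body) {
  return md5(body).digest('hex');
}

// Multipart upload (assumed format): MD5 over the concatenated *binary*
// MD5 digests of each part, then "-<number of parts>".
function multipartEtag(body, partSize) {
  const digests = [];
  for (let offset = 0; offset < body.length; offset += partSize) {
    digests.push(md5(body.subarray(offset, offset + partSize)).digest());
  }
  return `${md5(Buffer.concat(digests)).digest('hex')}-${digests.length}`;
}
```

The catch is that the result depends on the part size, so the same file can end up with different ETags depending on which tool uploaded it.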
I guess what I really want is a complete nodejs port of aws-cli (which is written in python). But when I can install aws-cli on the target machine and just shell out to it from nodejs with 5 minutes of work, it's hard to justify the time commitment.
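For reference, the "5 minutes of work" version looks roughly like this. It assumes the aws executable is installed, on the PATH, and already configured with credentials on the target machine. buildSyncArgs and syncWithCli are names invented for this sketch; --exclude and --delete are existing aws s3 sync options.

```javascript
const { execFile } = require('child_process');

// Build the argument list for `aws s3 sync`.
// `exclude` is a list of glob patterns; `del` removes remote files
// that no longer exist locally.
function buildSyncArgs(localDir, s3Uri, { exclude = [], del = false } = {}) {
  const args = ['s3', 'sync', localDir, s3Uri];
  for (const pattern of exclude) {
    args.push('--exclude', pattern);
  }
  if (del) {
    args.push('--delete');
  }
  return args;
}

// Run the CLI and resolve with its stdout, or reject with its stderr.
function syncWithCli(localDir, s3Uri, options) {
  return new Promise((resolve, reject) => {
    execFile('aws', buildSyncArgs(localDir, s3Uri, options), (err, stdout, stderr) => {
      if (err) {
        reject(new Error(stderr || err.message));
      } else {
        resolve(stdout);
      }
    });
  });
}
```

Using execFile with an argument array rather than a shell string means paths and patterns with spaces or glob characters are passed through untouched.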
If you want a CLI for S3, there's node-s3-cli (Node-based). It's a bit of an oddball, though, in that it's a "drop-in" replacement for s3cmd, which is a Python CLI available as a binary package in Linux distros, so you see it kicking around in backup scripts and things. It seems stable and reliable, though.
I actually switched my version from the latest release to master and it's working a bit better. Maybe we just need another release? At least it's syncing folders correctly now.
BTW, I switched to master just to see, and my issues mostly went away. Have any of you tried that, @leviwheatcroft @elliot-nelson @ryancole? Or did you find something else?
@acidjazz thanks for the heads up. We had issues overwriting existing content on S3 with version 4.4 from npm, but we have tried pulling from master directly and everything seems to work now!
Pulling directly from master is risky in case of an API change, so we've locked to the latest working commit in our package.json:
"s3": "https://github.com/andrewrk/node-s3-client.git#a1c25dd322bb22617a7255e84369b1e1c0289397"
This module isn't being updated and is starting to have issues working properly. I'd recommend people use the official AWS library directly.