hex7c0 / mongodb-backup

backup data for mongodb for Nodejs
https://github.com/hex7c0/mongodb-backup
Apache License 2.0
134 stars 49 forks

Write tar files to a stream instead of to the filesystem first #22

Open spmiller opened 7 years ago

spmiller commented 7 years ago

This improves performance when writing to a stream (i.e. there is no need to wait for the contents to be written to disk first), and can help solve EMFILE errors. This is achieved by using tar-stream instead of regular tar.

This commit introduces a small shim in front of the file system and tar-stream to allow both direct disk writes and tar file stream writes. It then uses the appropriate shim (file system or tar-stream) throughout the codebase to create directories and store files.
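For readers who have not opened the diff, the shim described above might look roughly like the sketch below. This is not the code from the PR: the function names `fileSystemStore` and `tarStreamStore` and the methods `addDirectory`, `store` and `close` are hypothetical. The only library calls assumed are `pack.entry()` and `pack.finalize()` from tar-stream, and the pack object is passed in rather than created here.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical store that writes straight to disk under `root`.
function fileSystemStore(root) {
  return {
    addDirectory(name) {
      fs.mkdirSync(path.join(root, name), { recursive: true });
    },
    store(name, content) {
      fs.writeFileSync(path.join(root, name), content);
    },
    close() {
      // Nothing to flush for plain files.
    }
  };
}

// Hypothetical store that appends entries to a tar-stream pack.
// `pack` is assumed to come from require('tar-stream').pack().
function tarStreamStore(pack) {
  return {
    addDirectory(name) {
      pack.entry({ name: name, type: 'directory' });
    },
    store(name, content) {
      pack.entry({ name: name }, content);
    },
    close() {
      // Signals that no further entries will be added.
      pack.finalize();
    }
  };
}
```

With both stores exposing the same three methods, the backup code can call `store(...)` without knowing whether the bytes land on disk or in a tar stream that is piped directly to the consumer.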

Potentially fixes #20.

coveralls commented 7 years ago

Coverage decreased (-71.4%) to 26.316% when pulling 1f85b758872167ddca2c6dd18fe5d7cef280b466 on spmiller:tar-streams into 20768db314c5a49eefd2a9e7028b2534772e1932 on hex7c0:1.6.

hex7c0 commented 7 years ago

Hi spmiller, thanks for your PR.

Could you change only index.js? That would make the code review easier.

coveralls commented 7 years ago

Coverage decreased (-72.7%) to 25.0% when pulling 82b80c6e74e47a06c0710942b1970fb48113713f on spmiller:tar-streams into 20768db314c5a49eefd2a9e7028b2534772e1932 on hex7c0:1.6.

spmiller commented 7 years ago

@hex7c0 done. I'm not sure why coveralls is reporting the coverage has decreased so dramatically -- when I ran the tests locally coverage stayed at 97.74%.

Edit: coverage decreased because the tests can't run on a PR since the URI is defined with a secure variable.

spmiller commented 7 years ago

I realised the way I was injecting the document store made it difficult to see what was actually changing in the diff. I have now changed it to be stored as a global variable rather than injecting it into functions, which has made the diff much easier to digest.

coveralls commented 7 years ago

Coverage decreased (-72.7%) to 25.0% when pulling 36b035eb27ae26fe5de2dc0ad20fd22f7e07bdf0 on spmiller:tar-streams into 20768db314c5a49eefd2a9e7028b2534772e1932 on hex7c0:1.6.

spmiller commented 7 years ago

Hi @hex7c0, is there anything I can do to help get this merged?

hex7c0 commented 7 years ago

I'll work on it soon. I'm so sorry for the delay.

mledwards commented 6 years ago

Any update on this? I'm getting the same issue with streams and fs. This would be much less of an issue if the plugin gzipped the archive, since the compressed output is tiny. My tar is coming out about 10 times bigger than the data.

Unfortunately, for me this renders the module useless, as databases keep growing and will inevitably exceed 100 MB.

Otherwise, it's a great module, I'm using it for hourly backups.
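Until the module compresses its own output, one possible stopgap is to gzip the finished tar file as a separate step. The sketch below uses only Node's built-in `fs`, `zlib` and `stream` modules; `gzipFile` is a hypothetical helper name, not part of mongodb-backup.

```javascript
const fs = require('fs');
const zlib = require('zlib');
const { pipeline } = require('stream');

// Hypothetical helper: compress `source` into `destination` with gzip.
// `callback` receives an error if any stage of the pipeline fails.
function gzipFile(source, destination, callback) {
  pipeline(
    fs.createReadStream(source),
    zlib.createGzip(),
    fs.createWriteStream(destination),
    callback
  );
}
```

For example, `gzipFile('dump/dump.tar', 'dump/dump.tar.gz', done)` could run once the backup has finished. This does not avoid writing the uncompressed tar to disk first, so it only helps with the size of the stored archive.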

spmiller commented 6 years ago

Hey @hex7c0, is there anything more you need me to do before merging this? Our project is starting to have issues backing up large-ish databases, so I may need to start using my fork :(

I am also planning a similar patch to the mongodb-restore library to support reading directly from the .tar.gz stream, but I wanted to get this one in first.

hex7c0 commented 6 years ago

Hey @spmiller, you are absolutely right!

I'll try to allocate some time to do it.

cortezcristian commented 5 years ago

Patched it to work with Mongo 4.x: https://github.com/spmiller/mongodb-backup/pull/5. Here's the package: https://www.npmjs.com/package/mongodb-backup-stream-4x

Phoscur commented 5 years ago

What's the holdup on this, @hex7c0?

tennox commented 5 years ago

Note that this seems to have a problem when the root directory does not exist (as it tries to create the tar file first):

mongoBackup({
  uri: Meteor.settings['mongoUri'],
  parser: 'json',
  root: 'dump/',
  logger: 'mongo.log',
  tar: 'dump.tar',
});

(STDERR) (node:11311) DeprecationWarning: current URL string parser is deprecated, and will be removed in a future version. To use the new parser, pass option { useNewUrlParser: true } to MongoClient.connect.
(STDERR) events.js:183
(STDERR)       throw er; // Unhandled 'error' event
(STDERR)       ^
(STDERR)
(STDERR) Error: ENOENT: no such file or directory, open '/home/manu/dev/my-app/.meteor/local/build/programs/server/dump/dump.tar'
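A possible workaround, until the library creates the directory itself, is to make sure the root directory exists before starting the backup. The sketch below assumes the tar file is opened at `root` joined with `tar`, which is what the path in the error suggests; `ensureTarDirectory` is a hypothetical helper, not part of mongodb-backup.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: create the directory that will hold the tar file.
// Assumes the tar path is `root` + `tar`, as the ENOENT path suggests.
function ensureTarDirectory(root, tar) {
  const tarPath = path.resolve(root, tar);
  // `recursive: true` makes this a no-op when the directory already exists.
  fs.mkdirSync(path.dirname(tarPath), { recursive: true });
  return tarPath;
}
```

Calling `ensureTarDirectory('dump/', 'dump.tar')` before `mongoBackup({...})` should leave `dump/` in place by the time the tar file is opened. The `recursive` option requires Node 10.12 or later.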