steadyequipment / node-firestore-backup

Google Firebase Firestore backup tool

Ability to restore backup #2

Open yoiang opened 6 years ago

yoiang commented 6 years ago

It'd be great to go both ways. Would preferably allow specifying overwriting and merging rules on documents and collections separately.

Slavrix commented 6 years ago

One issue with the current backup format is that it JSON.stringifies all of the data in the document.

This makes it hard to keep the data type of each field.

The main offenders are timestamps, geoPoints, references and null-typed fields. Arrays and Maps can also run into issues depending on how deep they are, since they need to be checked for these values too to ensure the data is saved correctly.
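
To make the type-loss problem concrete, here is a minimal sketch in plain Node.js, with no Firestore involved. The `original` object is made up for illustration; it only shows what a JSON round trip does to a Date value.

```javascript
// A plain JSON round trip flattens rich values into strings.
const original = {
  created: new Date(1510025554424),
  nested: { when: new Date(0) }, // nested values are affected too
};

const roundTripped = JSON.parse(JSON.stringify(original));

console.log(original.created instanceof Date);     // true
console.log(roundTripped.created instanceof Date); // false
console.log(typeof roundTripped.created);          // string
console.log(typeof roundTripped.nested.when);      // string
```

After the round trip there is nothing left in the file that says the field used to be a timestamp rather than an ordinary string.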

Here is a demo document that I put all the types in so we could see the values. To get this object, I used this snippet

    firestore.collection(theCollection).get()
    .then((snapshots)=> {
        if(!snapshots.empty) {
            snapshots.forEach((document)=> {
                console.log(document._fieldsProto);
            }) 
        }
    });
...
{
 teamName: { stringValue: 'Under 9s', valueType: 'stringValue' },
  arrrr: { arrayValue: { values: [Array] }, valueType: 'arrayValue' },
  team: { stringValue: 'U09', valueType: 'stringValue' },
  teamLimits: { stringValue: '16', valueType: 'stringValue' },
  null: { nullValue: 'NULL_VALUE', valueType: 'nullValue' },
  maxAge: { integerValue: '9', valueType: 'integerValue' },
  obj: { mapValue: { fields: [Object] }, valueType: 'mapValue' },
  created: 
   { timestampValue: { seconds: '1510025554', nanos: 424000000 },
     valueType: 'timestampValue' },
  geo: 
   { geoPointValue: { latitude: -34.202716, longitude: 151.171875 },
     valueType: 'geoPointValue' }
 }
...

Saving this to the files, instead of just the pure JSON of document.data(), makes it easier to retain the data types.

These can then be handled much more easily in a restore function, since an action can be taken on each object based on its type to ensure that the correct data type is retained on write.

I have yet to find out whether it is possible to just take this and write it as is with document.set or document.add or anything like that.
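
A rough sketch of that idea, assuming the tagged format shown above is what ends up in the backup file. `restoreHandlers` and `restoreField` are hypothetical names, not part of this repo or of the Firestore package, and only a few types are covered.

```javascript
// Hypothetical handler table: one action per valueType.
const restoreHandlers = {
  stringValue: (v) => v.stringValue,
  integerValue: (v) => parseInt(v.integerValue, 10),
  nullValue: () => null,
  timestampValue: (v) =>
    new Date(Number(v.timestampValue.seconds) * 1000 + v.timestampValue.nanos / 1e6),
};

function restoreField(tagged) {
  const handler = restoreHandlers[tagged.valueType];
  if (!handler) throw new Error('Unhandled valueType: ' + tagged.valueType);
  return handler(tagged);
}

// Round-trip a few fields from the demo document through JSON first,
// the way they would be read back from a backup file.
const backedUp = JSON.parse(JSON.stringify({
  teamName: { stringValue: 'Under 9s', valueType: 'stringValue' },
  maxAge: { integerValue: '9', valueType: 'integerValue' },
  null: { nullValue: 'NULL_VALUE', valueType: 'nullValue' },
  created: {
    timestampValue: { seconds: '1510025554', nanos: 424000000 },
    valueType: 'timestampValue',
  },
}));

const restoredFields = {};
for (const [name, tagged] of Object.entries(backedUp)) {
  restoredFields[name] = restoreField(tagged);
}
console.log(restoredFields);
```

Because the type tag travels with the value, the JSON round trip no longer loses anything that the restore step needs.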

Hope this helps in some way.

Slavrix commented 6 years ago

There doesn't seem to be a way to just add a document using the above format.

However, looking through the firestore package, at @google-cloud/firestore/src/doucments.json line 402 there is a method that decodes the above _fieldsProto into what is returned by documents.data()

It's private so it can't be called, but that's the switch we can use to decode the above format and retain the types.

  /**
   * Decodes a single Firestore 'Value' Protobuf.
   *
   * @private
   * @param proto - A Firestore 'Value' Protobuf.
   * @returns {*} The converted JS type.
   */
  _decodeValue(proto) {
    switch (proto.valueType) {
      case 'stringValue': {
        return proto.stringValue;
      }
      case 'booleanValue': {
        return proto.booleanValue;
      }
      case 'integerValue': {
        return parseInt(proto.integerValue, 10);
      }
      case 'doubleValue': {
        return parseFloat(proto.doubleValue, 10);
      }
      case 'timestampValue': {
        return new Date(
          proto.timestampValue.seconds * 1000 +
            proto.timestampValue.nanos / MS_TO_NANOS
        );
      }
      case 'referenceValue': {
        return new DocumentReference(
          this.ref.firestore,
          ResourcePath.fromSlashSeparatedString(proto.referenceValue)
        );
      }
      case 'arrayValue': {
        let array = [];
        for (let i = 0; i < proto.arrayValue.values.length; ++i) {
          array.push(this._decodeValue(proto.arrayValue.values[i]));
        }
        return array;
      }
      case 'nullValue': {
        return null;
      }
      case 'mapValue': {
        let obj = {};
        let fields = proto.mapValue.fields;

        for (let prop in fields) {
          if (fields.hasOwnProperty(prop)) {
            obj[prop] = this._decodeValue(fields[prop]);
          }
        }

        return obj;
      }
      case 'geoPointValue': {
        return GeoPoint.fromProto(proto.geoPointValue);
      }
      case 'bytesValue': {
        return proto.bytesValue;
      }
      default: {
        throw new Error(
          'Cannot decode type from Firestore Value: ' + JSON.stringify(proto)
        );
      }
    }
  }
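
As a worked example of the timestampValue case, here is the same arithmetic applied to the `created` field from the demo document. This is a standalone sketch: `timestampToDate` is a hypothetical helper, and `MS_TO_NANOS` is assumed to be 1,000,000 (nanoseconds per millisecond), which is what the name implies.

```javascript
const MS_TO_NANOS = 1000000; // assumed value: nanoseconds per millisecond

function timestampToDate(ts) {
  // `seconds` arrives as a string; `*` coerces it to a number.
  return new Date(ts.seconds * 1000 + ts.nanos / MS_TO_NANOS);
}

const created = timestampToDate({ seconds: '1510025554', nanos: 424000000 });
console.log(created.getTime());     // 1510025554424
console.log(created.toISOString()); // 2017-11-07T03:32:34.424Z

// Date only has millisecond precision, so finer-grained nanos are dropped.
const truncated = timestampToDate({ seconds: '1510025554', nanos: 424500000 });
console.log(truncated.getTime());   // 1510025554424
```

So decoding to a Date is fine for most restores, but anything below a millisecond in the original timestamp will not survive it.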

Slavrix commented 6 years ago

Reference data types still need more work, though.

The reference can be reimported through let reference_to_import = firestore.doc(the_saved_reference_string)

The saved string has to be trimmed first, and its depth needs to be taken into account. It looks like 'projects/{PROJECT_NAME}/databases/(default)/documents/{COLLECTION_NAME}/{DOC_ID}'

To get the reference we only need the collection and doc ID, plus anything after that for deeper paths, IF it is in the same database. Currently I don't believe it's possible to reference documents in other databases, but the beginning section will probably be used for that in the future in some way.

So we need to do a split here that removes the beginning part. The best way I can think of is to get the beginning part of the string from the initialised firestore object, match it against the saved string, and remove it that way, or something.

Slavrix commented 6 years ago

It's janky but it works:

        let toRemove = 'projects/'+firestore._referencePath._projectId+'/databases/'+firestore._referencePath._databaseId+'/documents/';

        let toFind = proto.referenceValue.replace(toRemove, '');
        console.log(toFind);
        return firestore.doc(toFind);

in the referenceValue section of the switch
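
A possible variant that avoids reading the private `_referencePath` fields is to strip the prefix with a regular expression. This is only a sketch: `relativeDocumentPath` and `RESOURCE_PREFIX` are hypothetical names, and it assumes every saved reference follows the 'projects/.../databases/.../documents/...' shape described above.

```javascript
// Matches the resource-name prefix up to and including "documents/".
const RESOURCE_PREFIX = /^projects\/[^/]+\/databases\/[^/]+\/documents\//;

// Hypothetical helper: returns the path relative to the database root.
function relativeDocumentPath(referenceValue) {
  if (!RESOURCE_PREFIX.test(referenceValue)) {
    throw new Error('Unexpected reference format: ' + referenceValue);
  }
  return referenceValue.replace(RESOURCE_PREFIX, '');
}

console.log(
  relativeDocumentPath('projects/my-project/databases/(default)/documents/teams/U09')
); // teams/U09
console.log(
  relativeDocumentPath(
    'projects/my-project/databases/(default)/documents/teams/U09/players/p1'
  )
); // teams/U09/players/p1
```

The result could then be passed to firestore.doc() as in the snippet above. Note that this drops whatever project and database the backup came from, which may or may not be what you want when restoring into a different project.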

Slavrix commented 6 years ago

Hopefully this helps if you guys haven't already got restoring working better. There is probably a better way to do it, but this works for me for now.

yoiang commented 6 years ago

@Slavrix !! Thank you for all this research! This is way farther than I've gotten in thinking about it!

Hmmmm, off the top of the skull the switch you wrote will likely have to be the direction we go.

Running through other possibilities that come to mind: one assumption would be that anything received from the server, if somehow maintained in its exact form, can be pushed back safely. To that end, if we could preserve all information about the object, we should be able to write each document object's data to a binary file rather than JSON. However, because some of these data types are functional objects and classes, and Javascript being a modern language, the information composing said object or class is likely stored elsewhere and only referred to. So what we write to disk would be an incomplete copy.

Slavrix commented 6 years ago

Yea that switch itself came from the official node firebase package with just the one tweak for reference files. I figured if that's how they are doing it it's probably the correct way to go also.

If there is a way to add an entire documentRef straight to firestore that would be the best as then we can retain the rest of the metadata also, but I haven't been able to see one yet.

lpellegr commented 6 years ago

This project is a really great initiative. However, be careful.

It seems you do not provide any restoration mechanism and you claim on the front page:

Relax: That's it! :sparkles::rainbow:

Are you joking? Maybe I missed something but at first sight, it looks like a business killer! A backup that cannot be restored or has not been tested as restorable is just the same as no backup...

yoiang commented 6 years ago

Hey @lpellegr , thank you for the input!

I don't think this is the best forum for a discussion of the merits or usefulness of the project, especially in an issue where we are discussing how to accomplish exactly what you feel is lacking 😉 Feel free to reach out to me directly if you would like to further discuss!

And please contribute to the conversation or submit a PR, we can use the help!

yoiang commented 6 years ago

@s-shiva1995 created a PR #18 that enumerates most of the types, similar to what you described @Slavrix, I just got a chance to give it a look and it looks like a good start!

jeremylorino commented 6 years ago

Began the basics of the restore functionality. Luckily @s-shiva1995 set us up nicely with the backup format.

Going to approach this the same way as the backup.

  1. Serial. No merging; will write over existing docs if they exist
  2. Retry options to allow a partial or full retry if documents fail. (Prerequisite for handling failed sub collection restores)
  3. Option to drop all data before restore (ensure no orphaned sub collections)
  4. Then approach the merge restore
  5. Then a parallel restore
  6. Then this will lead into using Pub/Sub and cloud functions for the restore (need to mind the Firestore API quota when dealing with large datasets in async mode)
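
Steps 1 and 2 of the plan above could look roughly like the following. This is a sketch under assumptions, not the PR's code: `restoreSerially` is a hypothetical helper, and `writeDoc(path, data)` stands for whatever async call actually writes a document (here replaced by an in-memory stand-in).

```javascript
// Hypothetical helper: restore documents one at a time, retrying failures.
async function restoreSerially(docs, writeDoc, maxAttempts = 3) {
  const failed = [];
  for (const { path, data } of docs) {
    let written = false;
    for (let attempt = 1; attempt <= maxAttempts && !written; attempt++) {
      try {
        await writeDoc(path, data); // overwrites any existing doc, no merging
        written = true;
      } catch (err) {
        if (attempt === maxAttempts) failed.push({ path, error: err.message });
      }
    }
  }
  return failed; // callers can feed this list into a partial retry
}

// Usage with an in-memory stand-in for the database.
const store = new Map();
let flakyCalls = 0;
const fakeWrite = async (path, data) => {
  if (path === 'teams/flaky' && ++flakyCalls < 2) throw new Error('transient');
  if (path === 'teams/broken') throw new Error('permanent');
  store.set(path, data);
};

const restoreResult = restoreSerially(
  [
    { path: 'teams/U09', data: { teamName: 'Under 9s' } },
    { path: 'teams/flaky', data: { teamName: 'Retry me' } },
    { path: 'teams/broken', data: {} },
  ],
  fakeWrite
);
restoreResult.then((failed) => console.log(failed));
```

Returning the failed documents instead of throwing keeps the serial run going and leaves the partial-or-full retry decision to the caller.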

stewones commented 6 years ago

the funniest part is that google should provide this =D

jeremylorino commented 6 years ago

stewwan do you mean that this is currently a feature they provide for Firestore? Or that this is something they should provide?

stewones commented 6 years ago

@jeremylorino no no, I meant that it is something they should provide. Not sure if they plan to release something. What is funny is a Google database as a service without a backup tool. lol.

jeremylorino commented 6 years ago

@stewwan I figured that is what you meant ;)

My personal opinion is that if I have time to build on top of any company's product to bridge a gap then I will. Give help; get help.

That being said Firestore is a pre-GA product. Which ultimately means there will be features that may seem like must-haves that are not yet included.

FYI I have not heard of the current priority of this feature in any public Google channel

stewones commented 6 years ago

yeah makes sense. I built an app that is running in production with thousands of records, I'm afraid to do something stupid when running database migrations, so I'll end up having to write some logic to restore data.. I'll try to PR something as soon as I can.

elitan commented 6 years ago

Google will most probably add backup/restore functionality in the future. But for now, this is the best we got.

jeremylorino commented 6 years ago

FYI There is no indication that the Firebase team will have this functionality in 2018

@yoiang going to PR this feature but i'm going to slap some tests around some of this repo. beginning to actually get a fair amount of code piling up. wanna get some tests in before it gets too heavy and we end up in a test hole.

yoiang commented 6 years ago

Let me know if I can help out on this PR, it's an important one

jeremylorino commented 6 years ago

@yoiang I am not very confident in putting a good suite of tests around the current repo using flow and babel. I converted to typescript and started a few tests. Before I go any further what are your thoughts?

https://github.com/now-ims/node-firestore-backup/tree/to-typescript

yoiang commented 6 years ago

Hey @jeremylorino ! Switching over to Typescript is a bit out of scope, and comes with its new set of pluses and minuses vs Javascript + Flowtype. I feel if other consumers and contributors feel strongly about Typescript we should discuss it but it should be a decision made independent of these features. It would be great to hear your concerns!

jeremylorino commented 6 years ago

@yoiang totally agree; which is why i went ahead and dropped y'all a line to get everyone's temperature here.

The largest factor that led me to doing a Typescript conversion was the fact that in my "day job" we run all Typescript so my toolchain setup for slapping together some unit tests was already done. I normally run fairly tight tslint rules to make sure my code walks a straight line and allows me to breakdown the pieces for testing.

not to mention the Firebase team is heavy Typescript for their Node.js projects

What toolchain are you using for dev of this repo? I am completely spoiled with vscode and all their fancy extensions.

atlanteh commented 6 years ago

Any update on this?

lpellegr commented 6 years ago

@atlanteh Restoration works perfect with https://www.npmjs.com/package/firestore-backup-restore

atlanteh commented 6 years ago

Oh great! Is that project deliberately separated from this one? Or just waiting some pr?

jeremylorino commented 6 years ago

I talked with the Firebase team at Next: this feature is coming in Firestore GA.

They are in the process of merging Firestore and Datastore.


atlanteh commented 6 years ago

Ohh greatt! And when will that be? Or they don't disclose that yet?

steve8708 commented 6 years ago

Looks like the official import/export tool for firestore was just released moments ago today! 🎉

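Docs https://firebase.google.com/docs/firestore/manage-data/export-import - Announcement https://firebase.googleblog.com/2018/08/more-cloud-firestore-improvements.html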

jeremylorino commented 6 years ago

Great work everyone. I know the Firebase team appreciates the work that was put into this repo as well as the feedback.

Ta-da!!


rupertbulquerin commented 5 years ago

Help on installing firestore-backup-restore

npm ERR! @google-cloud/firestore@0.19.0 compile: tsc -p . && cp -r dev/protos build && cp -r dev/test/fake-certificate.json build/test/fake-certificate.json && cp dev/src/v1beta1/firestore_client_config.json build/src/v1beta1/ && cp dev/conformance/test-definition.proto build/conformance && cp dev/conformance/test-suite.binproto build/conformance