codediodeio / firestore-migrator

:bullettrain_side: A CLI utility for moving data to and from Cloud Firestore
https://angularfirebase.com/lessons/import-csv-json-or-excel-to-firestore/

Add support for JSON objects, not just JSON arrays #2

Closed jaufgang closed 6 years ago

jaufgang commented 6 years ago

This looks like a cool and useful utility.

If you are using JSON rather than CSV or XLS files, though, and want to specify the document IDs instead of having them auto-generated, it would be more elegant to use a JSON object structured as a dictionary of sub-objects, where the top-level keys become the document IDs. That avoids having to use a JSON array together with the --id [id] switch.

Adding support for JSON objects and keeping the existing functionality for arrays would give the user flexibility to choose which approach to take.

codediodeio commented 6 years ago

@jaufgang I like that idea. Maybe we could add an --as-object flag that loops over the object keys.

jaufgang commented 6 years ago

yeah, but I'm not sure that's necessary. You could simply parse the JSON and detect if the returned value is an array or object.
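The detection suggested above could look roughly like the sketch below. It is only an illustration: `toDocEntries` and `DocEntry` are hypothetical names, not part of firestore-migrator, and a `null` ID is an assumed convention meaning "let Firestore auto-generate one".

```typescript
// Hypothetical shape: one entry per document to import.
// id === null is an assumed convention for "auto-generate the ID".
type DocEntry = { id: string | null; data: Record<string, unknown> };

// Hypothetical helper: accepts the result of JSON.parse and
// normalizes both supported top-level shapes into DocEntry[].
function toDocEntries(parsed: unknown): DocEntry[] {
  if (Array.isArray(parsed)) {
    // Array input: the structure carries no IDs.
    return parsed.map((item) => ({
      id: null,
      data: item as Record<string, unknown>,
    }));
  }
  if (parsed !== null && typeof parsed === "object") {
    // Object input: each top-level key becomes a document ID.
    const obj = parsed as Record<string, unknown>;
    return Object.keys(obj).map((id) => ({
      id,
      data: obj[id] as Record<string, unknown>,
    }));
  }
  throw new Error("Expected a JSON array or object at the top level");
}

const fromArray = toDocEntries(JSON.parse('[{"first_name":"Wendy"}]'));
const fromObject = toDocEntries(JSON.parse('{"11111":{"first_name":"Wendy"}}'));
```

The `Array.isArray` check has to come first, because `typeof` reports `"object"` for arrays as well.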

codediodeio commented 6 years ago

:+1: Good call. We'll just need to document that to make it clear. Feel free to send a PR, otherwise I will try to tackle this in a few days.

stildalf commented 6 years ago

@jaufgang a file structured like this?

{
  "11111": {
    "first_name": "Wendy",
    "last_name": "Phisher",
    "age": 22
  },
  "22222": {
    "first_name": "Magen",
    "last_name": "Bagelson",
    "age": 27
  },
  "33333": {
    "first_name": "Doug",
    "last_name": "Muffty",
    "age": 55
  },
  "44444": {
    "first_name": "Bob",
    "last_name": "Jones",
    "age": 31
  }
}

stildalf commented 6 years ago

Another option is to provide a means of stripping the used id field from the inserted doc, and keeping the array. Still, would be nice to support both arrays and keyed objects. I'll see if I can conjure up a PR.
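A minimal sketch of the "strip the ID field" option, assuming array items are plain objects. `splitId` is a hypothetical helper name, and the field name passed to it stands in for whatever the --id switch receives.

```typescript
// Hypothetical helper: pulls the ID out of an array item and returns
// the remaining fields, so the ID is not duplicated inside the document.
function splitId(
  item: Record<string, unknown>,
  idField: string
): { id: string; data: Record<string, unknown> } {
  const rawId = item[idField];
  if (rawId === undefined || rawId === null) {
    throw new Error('Item has no "' + idField + '" field to use as an ID');
  }
  // Shallow copy so the caller's object is left untouched.
  const data: Record<string, unknown> = { ...item };
  delete data[idField];
  return { id: String(rawId), data };
}

const splitExample = splitId({ id: 11111, first_name: "Wendy", age: 22 }, "id");
```

The ID is converted with `String(...)` because document IDs are strings, while a JSON source may store them as numbers.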

codediodeio commented 6 years ago

This got me thinking... What would be really cool is if you could define a schema as a TS interface or class, then enforce that schema on import to strip unused data and set defaults.

That would also open the possibility to create commands like ... migrate collection --schema=my-schema.ts to rename/reset properties across every document in a collection - just like a SQL migration.
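One way the schema idea might work at runtime is sketched below. TypeScript interfaces are erased at compile time, so this sketch assumes the schema is given as a plain object of default values instead. `applySchema` and `userDefaults` are hypothetical names, and nothing here checks value types.

```typescript
// Hypothetical helper: keeps only the keys named in `defaults`,
// filling in the default value for any key the raw document lacks.
function applySchema<T extends Record<string, unknown>>(
  raw: Record<string, unknown>,
  defaults: T
): T {
  const result: Record<string, unknown> = {};
  for (const key of Object.keys(defaults)) {
    const present = Object.prototype.hasOwnProperty.call(raw, key);
    result[key] = present ? raw[key] : defaults[key];
  }
  // Caveat: values are not validated, so this cast is only as
  // trustworthy as the input data.
  return result as T;
}

// Assumed example schema, based on the sample documents in this thread.
const userDefaults = { first_name: "", last_name: "", age: 0 };
const cleaned = applySchema({ first_name: "Bob", nickname: "Bobby" }, userDefaults);
// cleaned is { first_name: "Bob", last_name: "", age: 0 }
```

Running the same function over every document in an existing collection would be the "migration" case: unknown properties are dropped and missing ones are reset to their defaults.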

Maybe we should think about expanding this tool to provide a variety of utils.

jaufgang commented 6 years ago

@stildalf, yes, that's exactly what I mean. A file structured like that would import as 4 docs with IDs 11111, 22222, etc.

jaufgang commented 6 years ago

@codediodeio, I don't have time to create a PR for this right now, but I have a bunch of ideas and would be happy to contribute to this in the future when I can.

If you are interested in expanding this into a more generally useful tool, here's another related idea: allow for importing nested JSON data optionally as subcollections.

Here's one example of how to do it:

For a .json file like this:

{
   "11111":{
      "first_name":"Wendy",
      "last_name":"Phisher",
      "age":22,
      "posts":[
         {
            "timestamp":"2018-03-30T14:01:03.652Z",
            "text":"hello"
         },
         {
            "timestamp":"2018-03-30T14:01:03.652Z",
            "text":"goodbye"
         }
      ]
   },
   "22222":{
      "first_name":"Magen",
      "last_name":"Bagelson",
      "age":27,
      "collection:posts":[
         {
            "timestamp":"2018-03-30T14:01:03.652Z",
            "text":"hello"
         },
         {
            "timestamp":"2018-03-30T14:01:03.652Z",
            "text":"goodbye"
         }
      ]
   },
   "33333":{
      "first_name":"Doug",
      "last_name":"Muffty",
      "age":55,
      "collection:posts":{
         "aaaa":{
            "timestamp":"2018-03-30T14:01:03.652Z",
            "text":"hello"
         },
         "bbbb":{
            "timestamp":"2018-03-30T14:01:03.652Z",
            "text":"goodbye"
         }
      }
   }
}

Each of the 3 top-level items has a set of posts, but for items 22222 and 33333, the property names have a collection: prefix. This would cause the posts to be added as a subcollection under each of those docs, while the posts of 11111 would stay an ordinary array field. In the case of 22222 the posts are an array, so they would be added with auto-generated IDs, but since 33333 has an object for posts, they would be added with IDs taken from the keys aaaa and bbbb.

You could even have a --subcollection-prefix [prefix] switch to specify what prefix would indicate a subcollection.
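The prefix convention described above could be handled by separating each document's keys into plain fields and subcollections before writing. This is a sketch only: `splitSubcollections` is a hypothetical name, and the default prefix simply mirrors the `collection:` example.

```typescript
// Hypothetical helper: keys that start with `prefix` are treated as
// subcollections (with the prefix removed); everything else is a field.
function splitSubcollections(
  doc: Record<string, unknown>,
  prefix: string = "collection:"
): {
  fields: Record<string, unknown>;
  subcollections: Record<string, unknown>;
} {
  const fields: Record<string, unknown> = {};
  const subcollections: Record<string, unknown> = {};
  for (const key of Object.keys(doc)) {
    if (key.slice(0, prefix.length) === prefix) {
      subcollections[key.slice(prefix.length)] = doc[key];
    } else {
      fields[key] = doc[key];
    }
  }
  return { fields, subcollections };
}

// Document 22222 from the example: posts become a subcollection.
const prefixed = splitSubcollections({
  first_name: "Magen",
  age: 27,
  "collection:posts": [{ text: "hello" }, { text: "goodbye" }],
});

// Document 11111 from the example: posts stay a regular array field.
const unprefixed = splitSubcollections({
  first_name: "Wendy",
  posts: [{ text: "hello" }],
});
```

Each value in `subcollections` is itself either an array or a keyed object, so it can go through the same array-versus-object handling as the top-level file to decide between auto-generated and explicit IDs.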

stildalf commented 6 years ago

Well, I've pushed a PR with your suggestions @jaufgang. JSON Objects & nested sub-collections.

@codediodeio, shout if this scope is creeping too far from your original screencast intentions. I'd still like to investigate your schema/migration ideas. Care to expand on them when you get a chance?

jaufgang commented 6 years ago

Wow, that's awesome! Wasn't expecting these ideas to be implemented in the blink of an eye. :)

codediodeio commented 6 years ago

Thanks @stildalf for your hard work. I'll close this issue, but please open more issues if you have additional ideas.