k1LoW / tbls

tbls is a CI-friendly tool for documenting a database, written in Go.
MIT License

feat: [MongoDB] Support multiple type field #540

Closed mrtc0 closed 7 months ago

mrtc0 commented 7 months ago

Issue Context / why we need it

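MongoDB is schemaless, so the same field may hold values of different types in different documents. When documenting such a database, the $sample stage may select different documents on each execution, so the type reported for a field can change from run to run.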

For example:

$ cat init/init.js
db.createCollection('tblstest');
db.tblstest.insertMany(
  [
    { name: 'test1', age: 1 },
    { name: 'test2', age: 2 },
    { name: 'test3', age: 3 },
    { name: 'test4', age: 4 },
    { name: 'test5', age: null },
    { name: 'test6', age: null },
    { name: 'test7', age: null },
    { name: 'test8', age: null },
    { name: 'test9', age: 9 },
    { name: 'test10', age: 10 },
  ]
)

$ docker run --rm -it -p 27017:27017 -v $PWD/init:/docker-entrypoint-initdb.d mongo:7.0.3

$ cat .tbls.yml
dsn: mongodb://localhost:27017/test?sampleSize=10
docPath: ./docs/schemas/mongodb
er:
  skip: true

# Repeat multiple times...
$ tbls doc --rm-dist
...

$ git diff
diff --git a/docs/schemas/mongodb/test.tblstest.md b/docs/schemas/mongodb/test.tblstest.md
index d5987f7..881552b 100644
--- a/docs/schemas/mongodb/test.tblstest.md
+++ b/docs/schemas/mongodb/test.tblstest.md
@@ -9,7 +9,7 @@ Count of documents is 10
 | Name | Type | Default | Nullable | Occurrences | Percents | Children | Parents | Comment |
 | ---- | ---- | ------- | -------- | ----------- | -------- | -------- | ------- | ------- |
 | _id | objectId |  | false | 5 | 100.0 |  |  |  |
-| age | int32 |  | false | 5 | 100.0 |  |  |  |
+| age | <nil> |  | false | 5 | 100.0 |  |  |  |
 | name | string |  | false | 5 | 100.0 |  |  |  |

Getting a different diff on each execution is bothersome, so it would be preferable either to suppress this or to output every type found among the sampled documents.
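The instability shown in the diff can be illustrated without a database. The sketch below is not the tbls implementation: `bsonLabel` and `lastSeenType` are hypothetical helpers, and "keep the type of the last sampled value" is an assumed stand-in for any strategy that records only one type per field.

```go
package main

import "fmt"

// bsonLabel maps a decoded value to a type label like those in the diff above.
// Hypothetical helper; only the types used in the example are handled.
func bsonLabel(v any) string {
	switch v.(type) {
	case nil:
		return "<nil>"
	case int32:
		return "int32"
	case string:
		return "string"
	default:
		return fmt.Sprintf("%T", v)
	}
}

// lastSeenType keeps a single type per field: whichever value is seen last.
// This is an assumption made for illustration, not how tbls is known to work.
func lastSeenType(sample []any) string {
	label := ""
	for _, v := range sample {
		label = bsonLabel(v)
	}
	return label
}

func main() {
	// Two possible five-document samples of the age field from the same collection.
	sampleA := []any{int32(1), nil, int32(3), nil, int32(9)}
	sampleB := []any{int32(2), int32(4), nil, int32(10), nil}

	fmt.Println(lastSeenType(sampleA))
	fmt.Println(lastSeenType(sampleB))
}
```

Both samples come from the same collection, yet a single-type strategy reports a different type for each, which is what produces the spurious diff.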

What this PR does

All types observed for each field in the documents selected by the $sample stage are enumerated, separated by commas.
This is enabled with the multipleFieldType query parameter in the DSN.
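For readers unfamiliar with DSN query parameters, here is a minimal sketch of how `sampleSize` and `multipleFieldType` could be read from the DSN using only the Go standard library. The `sampleOptions` struct and `parseSampleOptions` function are hypothetical names for illustration and are not taken from the tbls code base. Defaulting to `false` when the parameter is absent is also an assumption.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// sampleOptions holds the two query parameters used in this description.
// Hypothetical type; field names are chosen for illustration only.
type sampleOptions struct {
	SampleSize        int
	MultipleFieldType bool
}

// parseSampleOptions extracts sampleSize and multipleFieldType from a DSN.
// Parameters that are absent keep their zero values (0 and false).
func parseSampleOptions(dsn string) (sampleOptions, error) {
	var opts sampleOptions
	u, err := url.Parse(dsn)
	if err != nil {
		return opts, err
	}
	q := u.Query()
	if v := q.Get("sampleSize"); v != "" {
		n, err := strconv.Atoi(v)
		if err != nil {
			return opts, fmt.Errorf("invalid sampleSize %q: %w", v, err)
		}
		opts.SampleSize = n
	}
	if v := q.Get("multipleFieldType"); v != "" {
		b, err := strconv.ParseBool(v)
		if err != nil {
			return opts, fmt.Errorf("invalid multipleFieldType %q: %w", v, err)
		}
		opts.MultipleFieldType = b
	}
	return opts, nil
}

func main() {
	opts, err := parseSampleOptions("mongodb://localhost:27019/test?sampleSize=10&multipleFieldType=true")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", opts)
}
```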

For example:

$ cat .tbls.yml
dsn: mongodb://localhost:27019/test?sampleSize=10&multipleFieldType=true
docPath: ./docs/schemas/mongodb
er:
  skip: true

$ tbls doc --rm-dist
$ git diff
diff --git a/docs/schemas/mongodb/test.tblstest.md b/docs/schemas/mongodb/test.tblstest.md
index d5987f7..9e30dda 100644
--- a/docs/schemas/mongodb/test.tblstest.md
+++ b/docs/schemas/mongodb/test.tblstest.md
@@ -8,9 +8,9 @@ Count of documents is 10

 | Name | Type | Default | Nullable | Occurrences | Percents | Children | Parents | Comment |
 | ---- | ---- | ------- | -------- | ----------- | -------- | -------- | ------- | ------- |
-| _id | objectId |  | false | 5 | 100.0 |  |  |  |
-| age | int32 |  | false | 5 | 100.0 |  |  |  |
-| name | string |  | false | 5 | 100.0 |  |  |  |
+| _id | objectId |  | false | 10 | 100.0 |  |  |  |
+| age | <nil>,int32 |  | false | 10 | 100.0 |  |  |  | # 👈 
+| name | string |  | false | 10 | 100.0 |  |  |  |

I've implemented this feature as one possible solution, but whether to hold the types in a comma-separated format may need discussion. I would like to hear your opinion.
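To make the comma-separated format concrete, here is a rough sketch of collecting the distinct types of one field and joining them into a single string, then splitting that string back apart as a consumer might. `typeName` and `joinTypes` are hypothetical helpers, not the code in this PR. Sorting the labels is an assumption made here so that the output does not depend on sampling order.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// typeName maps a decoded value to a type label like those in the diff above.
// Hypothetical helper; only the types used in the example are handled.
func typeName(v any) string {
	switch v.(type) {
	case nil:
		return "<nil>"
	case int32:
		return "int32"
	case string:
		return "string"
	default:
		return fmt.Sprintf("%T", v)
	}
}

// joinTypes returns the distinct type labels of values, sorted and comma-separated.
func joinTypes(values []any) string {
	seen := map[string]struct{}{}
	for _, v := range values {
		seen[typeName(v)] = struct{}{}
	}
	names := make([]string, 0, len(seen))
	for name := range seen {
		names = append(names, name)
	}
	sort.Strings(names) // assumed ordering, for deterministic output
	return strings.Join(names, ",")
}

func main() {
	ages := []any{int32(1), int32(2), nil, nil, int32(9)}
	joined := joinTypes(ages)
	fmt.Println(joined) // <nil>,int32

	// A consumer that needs the individual types has to split the string again.
	fmt.Println(len(strings.Split(joined, ","))) // 2
}
```

The trade-off is that Type stays a plain string, so existing output formats keep working, while any consumer that wants the individual types must parse the value.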

k1LoW commented 7 months ago

@mrtc0 Thank you for your GREAT proposal.

At this time, Type is not a multi-valued structure, so I think comma-delimited output is appropriate.

k1LoW commented 7 months ago

Thank you!