Open gitsathish opened 4 years ago
Hi, thanks for reaching out. You can do this with the inSet constraint (see the User Guide or the examples for more information), like this:
[tim@sn1 bin]$ cat profiles/cardinality.json
{
  "fields": [
    {
      "name": "an_integer",
      "type": "integer",
      "nullable": false
    }
  ],
  "constraints": [
    {
      "field": "an_integer",
      "inSet": "integer_set.csv"
    }
  ]
}
[tim@sn1 bin]$ cat profiles/integer_set.csv
1
25000
10
1000
[tim@sn1 bin]$ ./datahelix --profile-file=profiles/cardinality.json --max-rows 3 --quiet
an_integer
25000
1000
1
Would this approach work for you? This would work for any data type.
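For the case in the request (100 unique integers between 1 and 25000), the set file can be generated rather than typed by hand. Below is a minimal sketch in Python, not part of DataHelix: the helper names `pick_unique_integers` and `set_file_text` are hypothetical, and the only assumption about the file format is the one-value-per-line layout shown in integer_set.csv above.

```python
import random


def pick_unique_integers(count, low, high, seed=None):
    """Return `count` distinct integers drawn from the inclusive range [low, high]."""
    rng = random.Random(seed)
    # random.sample draws without replacement, so every value is unique.
    return rng.sample(range(low, high + 1), count)


def set_file_text(values):
    """Render values one per line, matching the integer_set.csv layout above."""
    return "".join(f"{value}\n" for value in values)


# 100 distinct values between 1 and 25000; the seed only makes the run repeatable.
values = pick_unique_integers(100, 1, 25000, seed=42)
print(set_file_text(values), end="")
```

Redirecting the script's output into profiles/integer_set.csv and running with --max-rows 10000 then yields rows whose values all come from that set, so the column has at most 100 distinct values.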
Feature request
Is there a way to impose cardinalities on columns? For example: generate 10000 rows with an Integer column whose min and max are 1 and 25000, but with only 100 unique values of that integer across the 10000 rows.
Similar functionality for String columns would be useful as well.
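Whether a generated file meets such a cardinality requirement can be checked after the fact. The following is a small sketch in Python, assuming the output is CSV with a header row as in the transcript above; `distinct_values` is a hypothetical helper, and the sample text is copied from that transcript rather than from a real 10000-row run.

```python
import csv
import io


def distinct_values(csv_text, column):
    """Return the set of distinct values found in one column of CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[column] for row in reader}


# Output copied from the three-row run shown in the transcript.
sample_output = "an_integer\n25000\n1000\n1\n"
# Contents of integer_set.csv from the same transcript, as strings.
allowed = {"1", "25000", "10", "1000"}

found = distinct_values(sample_output, "an_integer")
print("distinct values:", len(found))
print("all values come from the set:", found <= allowed)
```

The same check works for a String column, since the values are compared as text.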