datafaker-net / datafaker

Generating fake data for the JVM (Java, Kotlin, Groovy) has never been easier!
https://www.datafaker.net
Apache License 2.0

Provide Stream output for transformers #1176

Open snuyanzin opened 4 months ago

snuyanzin commented 4 months ago

The problem with current methods like net.datafaker.transformations.JsonTransformer#generate(net.datafaker.transformations.Schema<IN,?>, int) is that they build the whole String and only then return it. As a result, memory consumption grows with the number of requested records, and for large counts a test such as the following fails with OutOfMemoryError:

@Test
void test2() {
    // Fixed seed so the generated names are reproducible
    BaseFaker faker = new BaseFaker(new Random(10L));
    Schema<Object, ?> schema = Schema.of(
        field("Text", () -> faker.name().firstName()),
        field("Bool", () -> faker.name().lastName())
    );

    JsonTransformer<Object> transformer = JsonTransformer.builder().build();
    // Builds all 50 million records as a single String in memory
    String json = transformer.generate(schema, 50_000_000);
    System.out.println(json);
}

There is not much we can do about this method: with this approach, the giant string value has to be stored somewhere anyway.

Another approach: instead of generating the final string value, we could generate a stream of values and return that.
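To illustrate the idea, here is a minimal sketch using only the JDK. It is not Datafaker's API: the class StreamingJsonSketch and the method generateStream are hypothetical names, and the two suppliers stand in for the faker-backed fields from the test above. Each record is built only when the consumer asks for it, so the full output is never held in memory at once.

```java
import java.util.function.Supplier;
import java.util.stream.Stream;

class StreamingJsonSketch {

    // Hypothetical helper (an assumption, not part of Datafaker):
    // lazily yields `limit` JSON records, one String per record.
    // No escaping is done, so the suppliers must return JSON-safe text.
    public static Stream<String> generateStream(Supplier<String> text,
                                                Supplier<String> bool,
                                                long limit) {
        return Stream.generate(() ->
                "{\"Text\": \"" + text.get() + "\", \"Bool\": \"" + bool.get() + "\"}")
            .limit(limit);
    }

    public static void main(String[] args) {
        // Records are produced on demand and can be garbage-collected
        // right after they are printed.
        generateStream(() -> "John", () -> "Doe", 3)
            .forEach(System.out::println);
    }
}
```

With this shape, a caller who really needs one String can still collect the stream, while a caller writing to a file or socket can consume it record by record.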

snuyanzin commented 3 months ago

Partially covered by https://github.com/datafaker-net/datafaker/pull/1177; however, not all formats are supported yet.

RVRhub commented 3 months ago

Do I understand correctly that this proposal suggests implementing a solution similar to the one used for JsonTransformer, but also for:

snuyanzin commented 3 months ago

Yes, you are right.

gatear commented 3 months ago

My PR covers SQL transformer support: https://github.com/datafaker-net/datafaker/pull/1264

gatear commented 1 month ago

There is also a PR for JavaObjectTransformer: https://github.com/datafaker-net/datafaker/pull/1313