eclipse-ee4j / yasson

Eclipse Yasson project
https://projects.eclipse.org/projects/ee4j.yasson
Other
204 stars 96 forks source link

Multithreading leads to file corruption #637

Open bmarwell opened 10 months ago

bmarwell commented 10 months ago

Describe the bug

When writing JSON files in parallel with multiple threads (>>100), JSON files are very likely (>80%) to get corrupted like so:

{ 
  "results": [
    { "key": "val]}
}

Sometimes even data from other files is included randomly, again invalidating the JSON file. This does not happen with FasterXML Jackson nor with Apache Johnzon (the latter also implementing Jakarta JSON-B).

To Reproduce

Here's working code with Jackson:

public class PatchJsonWriter implements AutoCloseable {

    private final Path outputDir;
    private final ExecutorService executorService;

    private final ObjectMapper om;

    public PatchJsonWriter(Path outputDir, ExecutorService executorService) {
        this.outputDir = outputDir;
        this.executorService = executorService;

        this.om = new ObjectMapper().findAndRegisterModules().setDefaultPrettyPrinter(new MinimalPrettyPrinter());
    }

    public void writeParallel(Collection<PatchParameterInfo> parameterInfos) {
        List<CompletableFuture<Void>> futures =
                parameterInfos.stream().map(this::writeAsync).toList();

        CompletableFuture.allOf(futures.toArray(CompletableFuture[]::new)).join();
    }

    private CompletableFuture<Void> writeAsync(PatchParameterInfo patchParameterInfo) {
        return CompletableFuture.runAsync(() -> doWritePatchParameterInfo(patchParameterInfo), this.executorService);
    }

    private void doWritePatchParameterInfo(PatchParameterInfo patchParameterInfo) {
        Path path = this.outputDir
                .resolve(patchParameterInfo.groupName())
                .resolve(patchParameterInfo.appNameName())
                .resolve(patchParameterInfo.version());
        Path file = path.resolve("patch.json");

        try {
            Files.createDirectories(path);
        } catch (IOException ioException) {
            throw new UncheckedIOException(ioException);
        }

        Logger logger = Logger.getLogger(PatchJsonWriter.class.getName());
        PatchParameterWrapper parameterWrapper = PatchParameterWrapper.from(patchParameterInfo.patchParameter());
        logger.info(() -> String.format(Locale.ROOT, "Writing file [%s] with content [%s]", file, parameterWrapper));

        try (OutputStream fos = Files.newOutputStream(
                file, StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            this.om.writeValue(fos, parameterWrapper);
        } catch (IOException ioException) {
            throw new UncheckedIOException(ioException);
        } catch (Exception ex) {
            throw new RuntimeException("Problem creating Jsonb instance", ex);
        }
    }

    @Override
    public void close() {}
}

This is the version I created with the latest Yasson version (3.0.4 -- where are the release notes btw?):

public class PatchJsonWriter implements AutoCloseable {

    private final Path outputDir;
    private final ExecutorService executorService;
    private final Jsonb jsonb;

    public PatchJsonWriter(Path outputDir, ExecutorService executorService) {
        this.outputDir = outputDir;
        this.executorService = executorService;

        this.jsonb = JsonbFactory.create();
    }

    // omit sne methods

    private void doWritePatchParameterInfo(PatchParameterInfo patchParameterInfo) {
        // ... same

        try (OutputStream fos = Files.newOutputStream(
                file, StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            this.jsonb.toJson(PatchParameterWrapper.from(patchParameterInfo.patchParameter()), fos);
        } catch (IOException ioException) {
            throw new UncheckedIOException(ioException);
        }
    }

    @Override
    public void close() throws IOException {
        try {
            this.jsonb.close();
        } catch (IOException ioException) {
            throw ioException;
        } catch (Exception other) {
            throw new UnsupportedOperationException(other);
        }
    }
}

Using this code, some files are corrupted like shown above:

{
  "result": [
    {
      "name": "app1.service.url"
    },
    {
      "name": "app1.service.ttl"
    },
    {
      "nawr": "app1.jwt.expiration"
    },
    {
      "namwrc."app1.ldap.searchbase"
        },
            "name"ot"app1.ldap.selfservice_searchba,
      n"  "erap.alias"
        },
        {
            "name": "otherappport"
        }
    ]
} "app1.jwt.encryptionKeySource"
        },
        }
    ]
}

The Jackson variant does not suffer from this behaviour. Yasson is mixing in contents from another file into this example file

Expected behaviour

System information:

Additional context

rdehuyss commented 9 months ago

We from JobRunr have the same issue. We are using Yasson as an option to generate JSON from some objects and sometimes it generates invalid JSON - this is with really limited parallelism (about 2 simultaneous requests).

The invalid JSON:

[{"deleteSucceededJobsAfter":129600.000000000,"firstHeartbeat":"2024-01-16T08:38:09.971713Z","id":"6e836e6f-3bea-4dc8-9d90-804ce2d3f76c","lastHeartbeat":"2024-01-16T08:38:39.994807Z","name":"Ismailas-MacBook-Pro.local","permanent,{"type":"severe-jobrunr-exception"0000000,"pollIntervalInSeconds":15,"processAllocatedMemory":18048136,"processCpuLoad":0.0020519469633203924,"processFreeMemory":17161821048,"processMaxMemory":17179869184,"running":true,"systemCpuLoad":0.2227047146401985,"systemFreeMemory":20553973760,"systemTotalMemory":68719476736,"workerPoolSize":96}]

Notice the permanent where invalid json is generated. Below is the valid json:

[{"deleteSucceededJobsAfter":129600.000000000,"firstHeartbeat":"2024-01-16T08:38:09.971713Z","id":"6e836e6f-3bea-4dc8-9d90-804ce2d3f76c","lastHeartbeat":"2024-01-16T08:41:40.032196Z","name":"Ismailas-MacBook-Pro.local","permanentlyDeleteDeletedJobsAfter":259200.000000000,"pollIntervalInSeconds":15,"processAllocatedMemory":48091728,"processCpuLoad":4.15962064712972E-4,"processFreeMemory":17131777456,"processMaxMemory":17179869184,"running":true,"systemCpuLoad":0.17733182589033353,"systemFreeMemory":20528300032,"systemTotalMemory":68719476736,"workerPoolSize":96}]
bmarwell commented 5 months ago

@rdehuyss for me I found that I was actually writing to the same file twice. Switching to Apache Johnzon only fixed it because it was way faster. For me, this is closed. But I can leave it open for you.

rdehuyss commented 5 months ago

@bmarwell : these were two different Strings (so not even files).

Verdent commented 1 month ago

Hello @bmarwell and @rdehuyss , I have tried to reproduce this and no luck so far.

Would you please be so kind and try to verify, if this still happens with the latest release?

bmarwell commented 1 month ago

No I was not able to reproduce it. Maybe it was overlapping writing to the same files after all?