datafaker-net / datafaker

Generating fake data for the JVM (Java, Kotlin, Groovy) has never been easier!
https://www.datafaker.net
Apache License 2.0
1.16k stars 160 forks source link

Unable to resolve expression during build #1306

Closed bodiam closed 1 month ago

bodiam commented 2 months ago

Hi all,

I have another build where an expression couldn't be resolved: https://github.com/datafaker-net/datafaker/actions/runs/9940728395/job/27458128814?pr=1305

I was hoping it was resolved in one of the earlier 2.3.x PRs, but it seems it's still happening, I had a few failures earlier today too.

If someone could have a look, that would be great!

asolntsev commented 2 months ago

I don't yet know what's the reason, but at least I know how to reproduce it. Just run this single test in IDEA, then open "Run Configurations" -> "Edit configurations" -> "Modify options" -> "Repeat" -> "Until failure". And run again.

Now test will be re-running endlessly, and sooner or later will fail with the expected error.

image image

snuyanzin commented 2 months ago

@RVRhub I remember you also mentioned some findings about this

asolntsev commented 2 months ago

@bodiam @snuyanzin I could reproduce the failure, but still don't understand why it happens. What I suggest: I can submit a PR that improves logging and adds debug logs at some places in DF. Then we can get more information from debug logs when it happens again.

kingthorin commented 2 months ago

I'm good with that 👆

Though I'm not exactly an authority for the project 😁

bodiam commented 2 months ago

I can submit a PR that improves logging and adds debug logs at some places in DF.

I don't mind that, but perhaps for narrowing down the problem here only? It's probably not something we need to merge right, the debugging can happen on a branch?

asolntsev commented 2 months ago

@bodiam It depends on the logging itself. But generally I think it's a good idea to add the logging to main branch. DEBUG logs are disabled by default, so it will not affect users. But when someone decides to investigate some specific problem, they can easily enable debug logs for the entire DF or some specific package.

bodiam commented 2 months ago

I've done some debugging, and this is what I see:

image

It's coming from here:

        Map<String[], MethodAndCoercedArgs> accessorMap =
            MAP_OF_METHOD_AND_COERCED_ARGS
                .getOrDefault(clazz, Collections.emptyMap())
                .getOrDefault(methodName, Collections.emptyMap());
        // value could be null
        if (accessorMap.containsKey(args)) {  <--- accessorMap is sometimes null
            return accessorMap.get(args);
        }

I don't think the accessorMap can be null under normal circumstances, but this is the COWMap, I'm not sure if that's related, but not sure how accessorMap can be null here (sometimes). When it's null, the nullpointer is swallowed by the above code when it's catching Exception.

asolntsev commented 2 months ago

Yes, this is one of my findings too. And proper logging would help to find this issue sooner. ;) But it doesn't seem to cause our initial problem. :(

bodiam commented 2 months ago

@asolntsev See screenshot. There is a log message exactly there: LOG.fine(e.getMessage()). That prints the nullpointer when you enable logging.

But I think it's related to the initial problem, since the above exception happens when some method returns null. In this case, null is returned when we get the NPE. Why do we get the NPE? I have no idea yet.

asolntsev commented 2 months ago

Why do we get the NPE?

As a rule, exceptions must be logged with stack trace: Log.log(FINE, e.getMessage(), e);

Then we would know why it happens. And I don't understand why this log is FINE, not ERROR or at least WARNING.

Anyway, I am working on it.

asolntsev commented 2 months ago

@bodiam @snuyanzin After some investigation, I came to a conclusion that this is a bug in CopyOnWriteMap, or at least its usage.

I reproduced this problem in a dedicated unit-tests here: https://github.com/datafaker-net/datafaker/pull/1310

P.S. This is how CopyOnWriteMap is used in FakeValuesService:

  final Map<String, Map<String[], MethodAndCoercedArgs>> stringMapMap =
    MAP_OF_METHOD_AND_COERCED_ARGS.computeIfAbsent(clazz, t -> new CopyOnWriteMap<>(WeakHashMap::new));

  // Step 1: we PUT value to map, but...
  stringMapMap.putIfAbsent(methodName, new CopyOnWriteMap<>(WeakHashMap::new));

  // Step 2: `map.get` returns NULL!
  stringMapMap.get(methodName).putIfAbsent(args, accessor); // here `stringMapMap.get(methodName)` is NULL

Assumably, null is returned after GC run.

asolntsev commented 2 months ago

My recommendation:

  1. Fix issue https://github.com/datafaker-net/datafaker/pull/1310 - @snuyanzin Can you help with this?
  2. Merge https://github.com/datafaker-net/datafaker/pull/1311