entur / lamassu

Mobility hub
European Union Public License 1.2
5 stars 7 forks source link

Unparsable feeds don't get updated nor are their metrics updated #453

Open hbruch opened 1 month ago

hbruch commented 1 month ago

Expected behavior

In case a feed is not parsable, e.g. due to malformed json, this should result in a properly logged exception and a metric should reflect this.

Observed behavior

In case of an unparsable JSON, GbfsJsonValidator.validate throws an uncatched Exception, which prevents that the delivery is consumed.

{"serviceContext":{"service":"lamassu"},"message":"Caught exception in ForkJoinPool\norg.json.JSONException: A JSONObject text must end with '}' at 7864 [character 7865 line 1]\n
 java.base/jdk.internal.reflect.DirectConstructorHandleAccessor.newInstance(DirectConstructorHandleAccessor.java:62)\n
 java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:502)\n
 java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:486)\n
 java.base/java.util.concurrent.ForkJoinTask.getThrowableException(ForkJoinTask.java:540)\n
 java.base/java.util.concurrent.ForkJoinTask.reportException(ForkJoinTask.java:567)\n
 java.base/java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:670)\n
 java.base/java.util.stream.ForEachOps$ForEachOp.evaluateParallel(ForEachOps.java:160)\n
 java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateParallel(ForEachOps.java:174)\n
 java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233)\n
 java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596)\n
 java.base/java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:765)\n
 org.entur.gbfs.GbfsSubscriptionManager.lambda$update$0(GbfsSubscriptionManager.java:91)\n
 java.base/java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1423)\n
 java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:387)\n
 java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1312)\n
 java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1843)\n
 java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1808)\n
 java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)\nCaused by: org.json.JSONException: A JSONObject text must end with '}' at 7864 [character 7865 line 1]\n
 org.json.JSONTokener.syntaxError(JSONTokener.java:503)\n
 org.json.JSONObject.<init>(JSONObject.java:252)\n
 org.json.JSONTokener.nextValue(JSONTokener.java:409)\n
 org.json.JSONArray.<init>(JSONArray.java:105)\n
 org.json.JSONTokener.nextValue(JSONTokener.java:416)\n
 org.json.JSONObject.<init>(JSONObject.java:237)\n
 org.json.JSONTokener.nextValue(JSONTokener.java:409)\n
 org.json.JSONArray.<init>(JSONArray.java:105)\n
 org.json.JSONTokener.nextValue(JSONTokener.java:416)\n
 org.json.JSONObject.<init>(JSONObject.java:237)\n
 org.json.JSONTokener.nextValue(JSONTokener.java:409)\n
 org.json.JSONObject.<init>(JSONObject.java:237)\n
 org.json.JSONObject.<init>(JSONObject.java:402)\n
 org.entur.gbfs.validation.validator.GbfsJsonValidator.parseFeed(GbfsJsonValidator.java:150)\n
 org.entur.gbfs.validation.validator.GbfsJsonValidator.lambda$parseFeeds$2(GbfsJsonValidator.java:144)\n
 java.base/java.util.HashMap.forEach(HashMap.java:1429)\n
 org.entur.gbfs.validation.validator.GbfsJsonValidator.parseFeeds(GbfsJsonValidator.java:144)\n
 org.entur.gbfs.validation.validator.GbfsJsonValidator.validate(GbfsJsonValidator.java:69)\n
 org.entur.gbfs.loader.v2.GbfsV2Subscription.validateFeeds(GbfsV2Subscription.java:125)\n
 org.entur.gbfs.loader.v2.GbfsV2Subscription.update(GbfsV2Subscription.java:106)\n
 java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)\n
 java.base/java.util.concurrent.ConcurrentHashMap$ValueSpliterator.forEachRemaining(ConcurrentHashMap.java:3612)\n
 java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)\n
 java.base/java.util.stream.ForEachOps$ForEachTask.compute(ForEachOps.java:291)\n
 java.base/java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:754)\n\t... 5 common frames omitted\n","severity":"WARN","reportLocation":{"filePath":"org.entur.lamassu.leader.FeedUpdater","lineNumber":"107","functionName":"lambda$start$0"}}

Version of lamassu used (exact commit hash or JAR name)

https://github.com/entur/lamassu/commit/373e17a1f0ca47a2a2119a5d3d9ab36e47ef3807

Suggested change

IMHO, GbfsJsonValidator.validate() should be changed to record parse exceptions per file als FileValidationError.

testower commented 1 month ago

This is a downstream repetition of an already existing problem. We don't have visibility into parse failures in the loader either. That said, we should definitely catch JSONException in the validator and re-throw a more specific exception.

Some possible solutions:

Assuming that a parsing problem in the validator is subsumed by an equivalent parsing error in the (file-)loader, do we need to distinguish between them? In other words, do we want the validation dto's to carry general information about errors encountered during loading?