snowplow-archive / schema-guru

JSONs -> JSON Schema
http://snowplowanalytics.com

Right Double Quote Crashing schema-guru #182

Open nicholsmarkd opened 4 years ago

nicholsmarkd commented 4 years ago

I encountered an issue today when attempting to generate a JSON schema from a JSON file. When I run schema-guru schema against a file that contains a right double quote (0xE2,0x80,0x9D), it runs fine until it encounters this character, at which point the program fails with the dump shown below. Is this a known issue, and are there any workarounds that would allow schema-guru to handle these characters?

command: /schema-guru-0.6.2 schema --output test.json bad.json

output:
Exception in thread "main" java.lang.RuntimeException: Directory [C:\Workspace\MongoDB\data\json\bad.json] does not contain any JSON files
    at scala.sys.package$.error(package.scala:27)
    at com.snowplowanalytics.schemaguru.cli.SchemaCommand.processSchema(SchemaCommand.scala:96)
    at com.snowplowanalytics.schemaguru.Main$delayedInit$body.apply(Main.scala:21)
    at scala.Function0$class.apply$mcV$sp(Function0.scala:40)
    at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
    at scala.App$$anonfun$main$1.apply(App.scala:71)
    at scala.App$$anonfun$main$1.apply(App.scala:71)
    at scala.collection.immutable.List.foreach(List.scala:318)
    at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:32)
    at scala.App$class.main(App.scala:71)
    at com.snowplowanalytics.schemaguru.Main$.main(Main.scala:18)
    at com.snowplowanalytics.schemaguru.Main.main(Main.scala)

update: verified that the 0xE2 0x80 0x9D byte sequence is causing schema-guru to crash as noted above. I used sed to find and replace it and was then able to run the file through to completion. I would still like to know why this sequence caused the error in schema-guru. I tested the same sequence in the jsonschema.net online converter and it processed it without error.
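
For readers who want a scripted version of the find-and-replace workaround, here is a minimal Python sketch. The replacement used in the original sed command is not stated, so this sketch assumes the character should be kept as data and swaps each U+201D for the JSON escape `\u201d`, which is only valid where the character sits inside a JSON string value. The names `escape_right_double_quotes` and `rewrite_file` are hypothetical, and whether schema-guru accepts the escaped form has not been verified here.

```python
import json

# U+201D RIGHT DOUBLE QUOTATION MARK; its UTF-8 encoding is 0xE2 0x80 0x9D.
RIGHT_DOUBLE_QUOTE = "\u201d"


def escape_right_double_quotes(text):
    """Replace each U+201D with the six-character JSON escape \\u201d."""
    return text.replace(RIGHT_DOUBLE_QUOTE, "\\u201d")


def rewrite_file(src_path, dst_path):
    """Copy src_path to dst_path with U+201D escaped; return the number replaced.

    Assumes the input file is UTF-8 encoded.
    """
    with open(src_path, encoding="utf-8") as src:
        text = src.read()
    count = text.count(RIGHT_DOUBLE_QUOTE)
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(escape_right_double_quotes(text))
    return count


# The escaped text parses to the same value as the original text.
sample = '{"note": "a 5\u201d pipe"}'
escaped = escape_right_double_quotes(sample)
assert json.loads(escaped) == json.loads(sample)
```

Because the escape decodes back to the same character, a standard JSON parser sees identical data in both files; only the bytes on disk differ.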

chuwy commented 4 years ago

Thanks for the report @nicholsmarkd. I don't think there are workarounds, but you can try it on a different platform (Linux) or a newer JRE (though I have to say the chance of success is very low). The problem is likely in an underlying JSON library such as json4s or Jackson that for some reason struggles to handle this char. Schema Guru has been stagnant for a long time, but we do still plan to revive it one day.
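
Since the root cause was not established in this thread, one way to narrow it down is to check the input independently of schema-guru. The Python sketch below locates every occurrence of the 0xE2 0x80 0x9D sequence and reports whether the bytes decode as UTF-8 and parse as JSON. The function names are hypothetical, and the check assumes the file holds a single JSON document. If the file passes, that would suggest, without proving, that the failure lies in how the tool or its runtime reads the file rather than in the data.

```python
import json

# UTF-8 encoding of U+201D RIGHT DOUBLE QUOTATION MARK.
RDQ_UTF8 = b"\xe2\x80\x9d"


def find_sequence_offsets(data, needle=RDQ_UTF8):
    """Return the byte offsets at which needle occurs in data."""
    offsets = []
    start = data.find(needle)
    while start != -1:
        offsets.append(start)
        start = data.find(needle, start + 1)
    return offsets


def check_json_bytes(data):
    """Classify raw bytes as 'ok', not valid UTF-8, or not valid JSON."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return f"not valid UTF-8: {exc}"
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        return f"valid UTF-8 but not valid JSON: {exc}"
    return "ok"
```

A typical use would be to read the file in binary mode (`open(path, "rb").read()`) and pass the bytes to both functions, so that no platform default encoding is involved in the check.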