tuomount / Open-Realms-of-Stars

4X Strategy game
GNU General Public License v2.0
134 stars 32 forks source link

Using external JSON de/serialization library #665

Closed BottledByte closed 11 months ago

BottledByte commented 11 months ago

The game's code contains custom JSON parser. While it gets job done, it's functionality is very basic. It cannot serialize objects, it cannot deserialize into objects, it lacks many quality-of-life features and it is additional code that has to be maintained.

Adoption external library for working with JSON format would fixed all of the above mentioned issues, with addition that programmers might know the library from other projects. So it could be more comfortable and easier using it instead of the custom solution. And it will provide new options for working with game data.

As the game will unlikely ever deal with JSON and/or other formats excessively, using smaller de/serialization library, like jackson-jr, should be IMHO sufficient.

tuomount commented 11 months ago

Reason why it uses my own JSON library is that Jackson is with Apache license which is not compatible with GPL2. I don't know if there is GPL2 JSON parser. If that's the case that library could be separated as own project, so then there would a GPL2 license JSON parser.

That custom parser lacks deserialize feature by design since it may open vulnerabilities that allow running external code in your project.

BottledByte commented 11 months ago

I am forgetting that this project is using GPLv2.

Anyway, there is org.json aka JSON-java. It is in public domain (making it AFAIK compatible with GPLv2). It is maintained and although it is maybe not as feature-rich as the other libraries, it seems to still be superior to current solution, as it can also serialize objects.

tuomount commented 11 months ago

Yes, Public Domain is probably compatible with GPLv2. I would need to verify that.

Currently JSONs are only used with HUE lamps. I don't see big benefit changing current JSON parser to new library. Serialization is easy for creating objects, I understand that, but almost every big Java vulnerabilities have been in serialization since it may allow executing code. For example Log4J was serialization issue, Jackson has had couple of serialization vulnerabilities. I am sure custom parser does not have those since, it does not support serialization.

Also I have fuzz tested couple years ago and fixed those found issues. So I my guess is it is pretty robust for what it is currently doing.

As I see Jackson is the JSON parser Java projects commonly use, I am not sure how common JSON-java is, so it might not it is not that common to other programmers. What game data where you thinking to get into JSON format? Also for end users changing the library does not bring any benefits, so I am not that keen to change it unless there is like advances, like storing some game data in JSON format.

For example it could be used for save files, it probably would increase the save file size, and might even slowdown the loading of save files. Also that library seems to support serialization of Java objects to JSON but not back from JSON to java objects.

Less there are third party libraries in project less like there needs to be done quick patching due some third party vulnerability. So currently there is just the OGG library which is delivered with actual binaries.

BottledByte commented 11 months ago

As I see Jackson is the JSON parser Java projects commonly use, I am not sure how common JSON-java is, so it might not it is not that common to other programmers.

I found it after searching for JSON library after 2 minutes. It is ranked 5th most used library. So not as much popular as Jackson, but still quite good. image

unless there is like advances, like storing some game data in JSON format.

:smirk: How did you guessed what I want more advanced JSON library for?! :smile:

I just finished a prototype of storing and loading some game data as JSON. First using Jackson-jr, but as you said it would conflict with GPLv2, I rebased my work on JSON-java and it works exactly as well, only with more boilerplate code.

If full JSON library gets added, I am planning to submit my changes and then dehardcode more game data. Even a minimalist library like JSON-java would do the trick (it has ~70 Kib, so even less that the OGG library).

For example it could be used for save files, it probably would increase the save file size, and might even slowdown the loading of save files

I don't plan JSONizing save files currently. As you say, it does not worth it. I am trying to move game definitions to JSON. It improves code clarity, shortens the gigantic Factory classes and allows for more modular design.

Also that library seems to support serialization of Java objects to JSON but not back from JSON to java objects.

Yes, you are correct. It does not have object mapping when deserializing, it has to be done by hand. :frowning_face: But it has API with generics, iterators and other modern stuff, so it is not as annoying to do it.

Less there are third party libraries in project less like there needs to be done quick patching due some third party vulnerability.

I agree that dependencies should be kept at minimum, to stay away from dependency hell. However, I don't think vulnerabilities are too important for a single-player, offline game. And JSON-java is a tiny, single-purpose library, not like the Hibernate framework :smile:

tuomount commented 11 months ago

Those gigantic factory classes in some other format would make sense, but is JSON good for it? JSON does not support comments, where as XML would. But neither of those aren't that fun to write either, but it would making modification maybe easier. At least I would prefer writing XML instead of JSON by hand, XML is a bit more readable, but that is just my view. I cannot remember out of my head what contained each factory, would some simple one line for each component/item etc work. If each line would contain all the parameters.

Do you know where is the source code for that JSON-in-Java library? I checked the maven repository and it looks like it has two high vulnerabilities each year: https://mvnrepository.com/artifact/org.json/json

Vulnerabilities are not bad, since most user probably do use it without the HUE lamps and using those also requires manual configuration has some kind of certificate pinning, which would require to know at least which kind of ID that used HUE bridge has. But as my day job I work with vulnerabilities and I would like to do something else on my free time :smile:

BottledByte commented 11 months ago

I will summarize what's my point, in order of importance:

  1. I want to move as much as possible game data definitions out of Java code.
  2. It must be some common format, not a custom one. I don't care it it is JSON, XML, YAML, TOML...
  3. I don't want to reinvent the wheel, writing custom parsers/serializers. That's a lot of work and it is bound to introduce custom errors/bugs/vulnerabilities.
  4. I don't want to pull in large and/or unnecessary dependencies.
  5. I want simple API to work with stored data as I am used to highly modern languages. (Which Java really isn't. It got var and Stream, at least :pray: )
  6. Since there is JSON used already, I thought having game data in JSON and using single 3rd-party library would kill two birds with one stone, while keeping dependencies at minimum.

If you know how to achieve simple storage of game data definitions in standardized format, while not making work with it pain, don't use 3rd-party libraries for that, yet not reinvent the wheel and write things from scratch... then you are better than me :smile:. I am trying as best as I can to find a solution.

I guess that if you do not approve/find some suitable 3rd-party library, I will at least try to use existing in-house JSON parser. Prototype number 3 of factory breakdown incoming :smile:

Those gigantic factory classes in some other format would make sense, but is JSON good for it? JSON does not support comments, where as XML would. At least I would prefer writing XML instead of JSON by hand, XML is a bit more readable, but that is just my view.

JSON-java is able to parse/output XML to/from JSON (see https://stleary.github.io/JSON-java/org/json/XML.html), so it can cover both JSON communication with HUE bridge as well as game data in JSON or XML.

Do you know where is the source code for that JSON-in-Java library?

I have already linked it in first comment when I mentioned it, but here is the link in raw text: https://github.com/stleary/JSON-java/ Also, the library is in maintenance mode it seems, so the amount of critical vulnerabilities is hopefully limited.

Which is maybe better that it is maintained than is situation with jOrbis, which is unmaintained for some 6-7 years now :upside_down_face:. And I doubt it is without vulnerabilities, they are probably just not discovered/reported as the library has only few dependents.

When mentioning jOrbis and it's age - it messes up audio on my modern Linux system each time I quit the game. (But now I am really mixing up issues! Bad me, I will stop here :joy: )

as my day job I work with vulnerabilities and I would like to do something else on my free time 😄

I can understand that :smile:

tuomount commented 11 months ago

Oh I thought that JSONJava and JSON-in-java are two different libraries. They even had different logo in maven and github.

I would say there are 3 options for those factories: XML, JSON and CSV YAML looks a bit difficult to read, TOML seems to be design for INI files.

There is XML Parser built in Java, but what I have used it is not that easy to use. Also XML is quite bloat. Not sure if anyone actually likes writing XML. JSON is a bit less bloat than XML, but it does not support comments. I think both XML and JSON would have problem that one would need to remember what attributes/elements are available or we would need to do schema file.

In CSV these could be told in title row, which might make modification a bit easier, but on the other hand big files might still look ugly. CSV is not that complex and there might be suitable library for it. Or I could make one :smile:

So I am guess you are dumping all thingies from factory into JSON and see what it looks like. It would be interesting to see at least very small portion of one factory in XML, JSON and CSV, and what looks best or is most editable and go with that. Content in easy editable file could improve possible modifications. But anyway I am starting to lean towards JSON. :smile:

Yea, That JOrbis is old and it looks like it is copypaste from C-version. But since it reads local and fixed files from JAR, I am not that worried about vulnerabilities there. But if does not work, then you should make issue of it and make there is an another library for Ogg files. I do not experience any audio issue with JOrbis on my Linux machines.

BottledByte commented 11 months ago

CSV is not that complex and there might be suitable library for it.

CSV cannot represent hierarchical data easily, which makes it impractical. It can be hacked to do that, but it would make it confusing. As I work with CSV files from time to time, I assure you that CSV might look like a good, simple format... But it is in-fact one of the worst in long-term. So please, no CSV :smile:

But anyway I am starting to lean towards JSON. 😄

It is the best option, IMHO. There are several open source games which use this format for game data definitions and moddability. The most extreme example I know of is CataclysmDDA project https://github.com/CleverRaven/Cataclysm-DDA . Last time I checked, they had around 60 Megabytes of JSON data! The amount of content in that game is stunning and AFAIK over 90% of it is in JSON, from terrain, maps and items to monsters and NPCs. They solved the need for occasional comment by using "//" field in objects for that purpose.

Anyway, I say this issue got quite derailed already. Let's move this discussion to #666

tuomount commented 11 months ago

CSV cannot represent hierarchical data easily, which makes it impractical.

This is absolutely true. So CSV is not good choice for this. So Since discussion is moved to #666 I think this can be closed then?

BottledByte commented 11 months ago

I would keep this open for now. We'll see how discussion will go regarding de-hardcoding game data and maybe return to this.

BottledByte commented 11 months ago

If I understood final conclusion made in #666 correctly, JSON-java aka org.json aka JSON-in-Java library will be probably used instead of current in-house parser.

Apart from the obvious benefits of using the library, such as better and "well-known" API and more features, I want to add next thing as a "cherry-on-top", why it is a good idea to move to external JSON library.

The in-house JSON parser contains a critical bug. It seems that it is unable to correctly parse following valid JSON data:

[ {
  "kind" : "TRADE_EMBARGO_REALM_CHOICE",
  "texts" : [ ]
} ]

I rest my case. :smile: Let's start using org.json library after 0.25.0 is released. :+1:

Addendum:
I spent like 2 hours to figure out what is causing the crash, if it is me or actual bug, because the in-house parser API was "not exactly ergonomic" and I had to fix like 4 of my misunderstandings of how the parser works. I am pretty sure it is not me, as I tested this for over an additional hour.

tuomount commented 11 months ago

If I understood final conclusion made in https://github.com/tuomount/Open-Realms-of-Stars/issues/666 correctly, JSON-java aka org.json aka JSON-in-Java library will be probably used instead of current in-house parser.

Yes

It seems that it is unable to correctly parse following valid JSON data:

[ {
  "kind" : "TRADE_EMBARGO_REALM_CHOICE",
  "texts" : [ ]
} ]

Seems it fails to parse empty array. It always expect that array has something inside. Obvious bug.

BottledByte commented 11 months ago

@tuomount Do you think you could look into pulling-in org.json aka JSON-Java and converting existing code that parses JSON to it? Since JSON parsing is AFAIK only in the HUE Bridge communication code and I cannot test it in any way, it would be just completely blind refactor if I would do it (read broken :slightly_smiling_face: ).

Also, there are methods/classes in that network code that got deprecated in newer Java versions. If you could migrate those as well, it would be great - my IDE would stop pestering me with messages that there is deprecated code and give me more useful hints :slightly_smiling_face: . The methods/classes are:

I don't want to pull-in the library while the in-house JSON parser is still present and in-use, but I could use to have it in my toolbelt soon (that is, on master branch). I am getting allergic to all the hardcoded content. Some of it introduced by me with RaceTraits :smile: .

tuomount commented 11 months ago

Are you sure about that javax.security.cert.X509Certificate class. BlindTrustManager class uses java.security.cert.X509Certificate which suppose to be still valid in Java21.

joinGroup was easy fix, when I found how to make it with old usage without need to choose interface.

BottledByte commented 11 months ago

Sorry, my fault. The deprecated stuff is usage of getSubjectDN() and getIssuerDN() methods. I only looked-up the wrong class :slightly_smiling_face:

tuomount commented 11 months ago

This parser is somewhat better(less boilerplate, after using utility) than my JSON parser but also annoying. Methods do not generally return null, instead they throw exceptions which kind of annoying, but as a work around I made utility class that catches exception and returns null. But then there is issue with HUE registration. HUE Bridge returns JSON array if there is an error and JSONObject if something is successful. There is no method to parse what ever JSON available. In my parser there was this JsonRoot which could handle what ever JSON type at beginning.

But if there are no errors from bridge lights are working. It is just the registeration part that cause now the issue. I'll continue with this tomorrow.

BottledByte commented 11 months ago

I appreciate you are working on it, @tuomount. :+1:

However, it appeared suspicious to me that you say the library is throwing exceptions non-stop... so I did you a review :smile:. Here is feedback and some directions

  1. The getXXX(key) methods you use declare that the key is expected to exist and be non-null. Hence the exceptions on missing/null keys. You are just using wrong part of the library for the task :slightly_smiling_face:
  2. Use optXXX(key, other) methods for that instead. They don't throw exceptions and will return you on error what you tell them to (like null).
  3. But don't use optXXX(key) methods - they return different things than what programmer might expect (some return empty strings instead of null, for example). Stay with the explicit optXXX(key, other) methods.
  4. Remove "JsonUtil", please :smile: That's a horrible hack, wasting runtime resources and duplicating and mangling the API that the org.json library already exposes anyway.
  5. "De-nest" the if statements (use "if bad return" instead of "if ok execute body"). It is pain to read it, it may disrupt version control, it is hard to refactor, it wastes precious whitespaces... :smile: You can "de-nest" ifs to a linear sequence instead of depth of 5+ in Bridge.updateAllLights(), for example.
  6. Amend the last commit which adds the "JsonUtil" with above feedback. "JsonUtil" should never even enter to history of master branch :smile:
  7. Take a look on JSONTokener's nextValue() for your issue with JSONArray vs JSONObject in a response from the HUE bridge. It can be used by simple instanceof and/or mapping to likely get you what you need.
  8. You might discover logical hiccup with redundant object allocation in that method after de-nesting the ifs. :wink:

If I will find anything else I consider clunky in the new JSON code post-merge to master, I will open a PR for refactor. You can bet on that :smile:

tuomount commented 11 months ago

Ah, optXXXX(key) works as my utility. I looked example how to use this API and it used getters. Currently there is still problem that optIntegerObject works quite interesting. It returns value 0 if value is not present in JSON. This does not work, since values can be between 0-255 or 0-65535 in hue lamps. If lamp is not color lamp it is missing the hue value totally, but this optIntegerObject returns zero. I need to read more about javadocs, I hope I don't need to use tokener.

Also what I noticed that there is no method that would go through full json to search certain key and pick first one it encountered. Instead I needed to find each level and then look key from there and then next and so on. This is because each level is hashmap and opt and get only fetches keys from that level. That also caused some pondering why error handling was not working.

tuomount commented 11 months ago

Okay, I finally got this working. There was optIntegerObject(key, defaultValue) method which did the trick. (Like you said earlier @BottledByte ) Merged in master. optString(key) seemed to work. I think it might return something weird if there is key which matches for JSONObject and not string.

BottledByte commented 11 months ago

optString(key) seemed to work

Only seemed to work. But it does return "" instead of null, and "" != null, so several branches in your code could never get called. Fixed in my PR.

BottledByte commented 11 months ago

org.json library is now used in master branch. Despite org.json library being rather primitive, it is still better and more featured than former in-house parser. :+1: Closing.