yairm210 / Unciv

Open-source Android/Desktop remake of Civ V
Mozilla Public License 2.0
8.49k stars 1.57k forks source link

Feature request: UiElementDocsWriter regex issue improvement #9633

Open hackedpassword opened 1 year ago

hackedpassword commented 1 year ago

per conversation in #9619

Moving issue here. More to follow.

hackedpassword commented 1 year ago

Context: That thing goes over Unciv's own sources and looks for calls to two specific functions that manage "skinnable UI", then tries to auto-write documentation.

Ok, so I see there's 3 chunks of extensive regex. I can read them, mostly standard. If there's any modifiers that act kotlin-specific, I'll double check with kotlin's implementation. To me it's basically a mux of python and java. I'll take some time assessing the code. So far looks pretty straightforward, as long as OOP methods and instancing don't get too involved. (the core diff between programmers and scripters)

@SomeTroglodyte if you have any specific examples of what this code is doing right/wrong, or why/how it's causing trouble other than being clunky regex, that would assist in my interpretation and assessment. Not necessary but helpful.

SomeTroglodyte commented 1 year ago

mux of python and java

kotlin has a few backends, for Unciv it is Java - or, it compiles for the same VM as Java. Has some marginal relevance as on some phones Regexes actually are more limited than on desktop. But these Regexes won't run on phones (uff, narrow escape).

specific examples

At the moment it doesn't make grave mistakes, though I think it might be too fragile. Well, it can't really ruin anything but docs or prevent startup on desktop... For UI colours -just maybe- one could think of a better output.

e.g. from this: private val queuedTechColor = skinStrings.getUIColor("TechPickerScreen/QueuedTechColor", colorFromRGB(7*2, 46*2, 43*2)) it makes this | TechPickerScreen/ | QueuedTechColor | 7*2, 46*2, 43*2 | | and a little later the programmer used table.background = skinStrings.getUiBackground("TechPickerScreen/Background", tintColor = queuedTechColor.darken(0.5f)) and the markdown says `| TechPickerScreen/ | Background | null | |' Remember goal is to support someone putting together json and ninepatch images for a UI skin - is this helpful? Limited. Could it be better? Maybe? One, for a background the doc -by design- shows if a specific offer for defaults is used, but not the tintColor - which is only used together with that default, if the skin provides the element then it's the skin controlling tint. For colours not directly related to a background, there's a few coding options, and showing the default maybe strips too much context - or not enough.

The problem with this sample is tiny - one, markdown interprets those multiplications as italic (follow link and switch to Preview to see). Two - we've cut the colorFromRGB call - would that have been useful for the skin author to know? Three - there's no way in hell to automatically parse that the coder later divides the RGB channels by two, so, where to draw the line?

Your help would be appreciated in the directions - validation (no crass mistakes? complicated where simple is possible?? "clunky" becomes "elegant" how?). Readability for other coders. Ideas.

hackedpassword commented 1 year ago

Got my head wrapped around UielementDocsWriter.kt. The code's still warm to the touch too! History on the file says you committed it recently. Your regex is good. Fantastic use of grouping and even better use of sending captured strings into later code.

I spent some quality time pulling apart the first expression. Ref'd the MDN implementation of regex that Kotlin uses, validated against comprehensive docs on named groups etc., and despite the "clunkiness", there's nothing wrong with how you built the code. I might have split and documented slightly differently, but even then, I still went back to reading your comments per your groupings which I think for the expected readability of developer code, does what its supposed to.

Hmm, step 2: understand the bigger picture. Done.

Step 3: Do it my way, see how that compares/aligns. I'm going to try some stuff locally. Probably in perl.

So to be clear, we want that markdown table looking like however the function/elements are expressed in the code? I'm going on that. I guess the only part I don't get atm is how the code is executed to produce the table. As in, is unciv built via compiler, and along the way some builder code runs? Knowing how this works would apply to the next step of translating whatever I might come up with to making it work here.

hackedpassword commented 1 year ago

"Monster" indeed. As exceptionally clever and effectively cleanly presented as is, and even though the regex's are broken into 3 distinct search patterns, they're doing too much. Each. And the code trying to choke down what the regex hands off is working too hard also. imo

The following are observations not criticisms. I get the rationale in building for a wide range of circumstances.

I picked perl for two reasons - 1. been using it since the late 90's. 2. this scenario is what perl is built for. The single most difficult problem to solve here is the variable occurrence of ()'s in the second/third/other side of the function calls. Functions are inconsistently implemented as single or multi-line blocks, plus functions with same-path elements with variable element assignments to variables. Not even perl's powerful (?R is handling these things with //gs context. The problem needs to be broken down into smaller pieces. That's where your code beautification will come into play.

As a stopping point for today, I recommend a refactor. There's so many corner cases to those 3 functions each could be it's own .kt with parent code orchestrating. I'm contemplating other ways to come at this, and I'm starting to consider python approaches to modeling data files. The data needs to be handled differently before parsing. This is not a sed/awk problem one can synthesize a neat regex to hash solution without significant obfuscation. Then again, looking at the rest of the codebase, devs here obviously know what they're doing, so I could be way off in my current assessment.

SomeTroglodyte commented 1 year ago

don't get atm is how the code is executed

Simple, really. The project structure says if you followed standard procedure to set up a development environment, you'll find the source tree alongside where the assets are - so the desktop launcher simply walks const val srcPath = "../../core/src/com/unciv/" if it exists. My debugs run under a separate work folder (so I can control what specifically from any changed assests goes into the project), so I needed a few more symlinks.

Anyway, that first confirmation is already worth a lot. The whole idea was someone else's, but at some point I extended it, then a little later I rewrote the original regex too, to make them consistently commented... I guess boss merged the changes on trust and result, so I felt it was missing a little peer review.

hackedpassword commented 1 year ago

Right, that's the other diff between programmers and scripters - I've never used/set up a dev env. Ambitions to one day step up to devops.

Having thought about the issue here more, I'm still in agreement about a data modeling solution, as The Right Solution. Versus hacking the data up and sewing it back together {wrong! -dethklok}. I really commend how you built this regex. Personally I haven't implemented group naming before, combined with in-code hand-off is highly astute. Very impressive.

Unless you need me to move forward on anything here, I'm going to stand down.

Edit: not entirely though. I'm talking with gpt4 about the issue. I can't simply walk away from a regex puzzle. gpt is suggesting a data modelling regex hybrid using ANTLR and py ast, expanding on my idea. I'll prob continue to work on this anyway for the sake of the whole reason why I got into unciv in the first place - gamedev potential. Once the problem is solved one way, translating to kotlin should be the "easy" part.

SomeTroglodyte commented 1 year ago

never used/set up a dev env

Not a negative. Keep your distance and your sanity...

walk away

The use case is so sideline, why not. Btw - to see the end result, look for mods babylon UI or gold UI. Programmer -> code -> UiElementDocsWriter -> wiki -> modder -> json -> UX

hackedpassword commented 1 year ago

fyi I'm slowly working on coming up with some solution to this. Sort of a side project.

hackedpassword commented 1 year ago

Here's what I've come up with: https://github.com/hackedpassword/UiElementDocsWriter

I'd like to get the conversion in a proposal state soon.

github-actions[bot] commented 7 months ago

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 15 days.

hackedpassword commented 7 months ago

This is on my plate but I can't dedicate time to this atm. Or maybe I can just have to figure out some more effective time management.

SeventhM commented 7 months ago

Well, How about I give you some time with a bit of a label. This is too technical for me to understand what's going on anyways

hackedpassword commented 7 months ago

So here's the idea I had, one of those ahead-of-its-time types (at the time).

I wanted to construct an Unciv GPT that was trained in all things Unciv - code, issues, docs, source and commits, discussions, mods, even Fandom Civ V info. With this, I'd be able to produce content bypassing technical limitations. I mean, who writes in assembly these days? Use your resources.

Now constructing GPTs are common, but I had to switch gears with r/l priorities. Plan to implement remains.

SeventhM commented 7 months ago

I doubt an Unciv GPT would do all too much help, but sure, you can try. Just make sure to carefully look over everything it does

yairm210 commented 7 months ago

We will judge PRs by content, not creator, add we always have. If you submit a PR that you claim was delivered to you in a dream by our good lord John Cleese himself the methodology will not change. That being said, I'm highly skeptical of anything GPT that isn't equivalent to googling solutions or copy pasting with minor changes - but I'd be more than happy to be disproved