Open Bakelt opened 1 week ago
I think I could write a Python script to go through all the files and reassign new unique key values to every entry across all files according to a set of rules. Is the above list of files containing unique key values comprehensive?
I'm a bit confused by this. I've definitely seen the IDs that are in the high numbers work. Are we sure we aren't talking about actual entry limits and not the numbers themselves?
Ran a quick test. Setting one string's ID to the other's ID modulo 65536 causes the second string to stop working. Our 100k+ string IDs currently only work because nothing is using their modulo-65536 counterparts.
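In other words, two IDs clash whenever they are equal modulo 65,536. A minimal check, under the assumption (based on the test above) that the game keys strings by id % 65536:

```python
# Assumption from the test above: the game keys strings by id % 65536,
# so two IDs clash when those truncated values match.
def collides(id_a: int, id_b: int) -> bool:
    return id_a % 65536 == id_b % 65536

print(collides(100000, 34464))  # True  (100000 % 65536 == 34464)
print(collides(100000, 34465))  # False
```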
Having grouped IDs would also be nice for making new tooltips in general. I ran into an issue with these before because the next string ID was already being used in an entirely different .json.
I think I could write a Python script to go through all the files and reassign new unique key values to every entry across all files according to a set of rules. Is the above list of files containing unique key values comprehensive?
The main hurdle I see is that we likely don't touch half of the .json files while Bonesy does, so we may need to check whether those are sequential or whether we'd just have to work around them.
I also don't know if we need to touch the other abnormal .json files like thul_rune_stacked.json
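For illustration, a rough sketch of the reassignment script I have in mind. It assumes each string file is a JSON array of entries with an integer "id" field and skips files we don't own; the RANGES values are placeholders until we agree on rules:

```python
import json
from pathlib import Path

# Placeholder per-file ranges (inclusive); to be replaced by whatever rules we agree on.
RANGES = {
    "item-runes.json": (30000, 31999),
    "item-names.json": (32000, 33999),
}

def reassign(strings_dir: str) -> dict:
    """Rewrite IDs file by file and return an old->new map for cross-referencing."""
    remap = {}
    for path in Path(strings_dir).glob("*.json"):
        if path.name not in RANGES:
            continue  # leave files we don't own (e.g. the ones Bonesy edits) untouched
        first, last = RANGES[path.name]
        entries = json.loads(path.read_text(encoding="utf-8-sig"))  # tolerate a BOM
        if first + len(entries) - 1 > last:
            raise ValueError(f"{path.name}: range too small for {len(entries)} entries")
        for offset, entry in enumerate(entries):
            remap[entry["id"]] = first + offset
            entry["id"] = first + offset
        path.write_text(json.dumps(entries, indent=2, ensure_ascii=False), encoding="utf-8")
    return remap
```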
Would a script that collects unused ID ranges for each file work for you?
Here is an extensive list of all the IDs used and unused (grouped into ranges for readability) and of the IDs over 65,536; Blizzard did overflow some IDs themselves. A sketch for regenerating the report follows the attachment below.
All .json files were extracted via Ladiks Casc Viewer from ..\data\data\local\lng\strings and data\data\local\lng\strings\metadata, then overwritten by the changed .json files found in the next branch.
At the end there is a summary of the overall number of IDs used across all the files and each file name that contains an overflow ID.
File - name of the file
Number of IDs - how many IDs the file uses
ID Range - the minimum and maximum ID value present in the file
Unused IDs - all unused IDs between the ID Range values above, represented as uninterrupted ranges
Overflow IDs - every ID in the file that is over 65,536
How do we wish to proceed?
ID ranges across json files.txt
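For anyone who wants to reproduce the attached report, something along these lines generates the same columns. Same assumption as above about the file layout; this is a sketch, not the exact script used here:

```python
import json
from pathlib import Path

def report(strings_dir: str) -> None:
    """Print per-file stats: ID count, ID range, unused gaps, and overflow IDs (>= 65536)."""
    for path in sorted(Path(strings_dir).glob("*.json")):
        ids = sorted({e["id"] for e in json.loads(path.read_text(encoding="utf-8-sig"))})
        if not ids:
            continue
        used, gaps, start = set(ids), [], None
        for i in range(ids[0], ids[-1] + 1):   # collapse unused IDs into readable ranges
            if i not in used and start is None:
                start = i
            elif i in used and start is not None:
                gaps.append((start, i - 1))
                start = None
        overflow = [i for i in ids if i >= 65536]
        print(f"{path.name}: {len(ids)} IDs, range {ids[0]}-{ids[-1]}, "
              f"unused {gaps}, overflow {overflow}")
```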
Here is a quick summary of all the files read and how many IDs are used in each; this may help us figure out how large an ID range to allocate to each file when we reorganize them:
Basically everything is sub-30k and mixed between files, so trying to redefine ranges for only one file would mean a big tear-up. For the lazy "let's not redo everything" approach, and to reduce the amount we're changing, we can make a clean cutoff at 30k, decide ranges based on what we've already done, and then just redo the overflow sections.
Non-overflowing files to base ranges around:
monsters.json (40,000 - 40,002)
item-runes.json (30,000 - 30,020)
item-names.json (32,000 - 32,041) and (50,403 - 50,413)
item-modifiers.json (60,000 - 60,025)
Modifications would then only be needed for the ranges below (also written out as data after this list), and we can just define them:
item-runes.json (30,000 - 31,999)
item-names.json (32,000 - 33,999), moving 50,403 - 50,413 into this range
ui.json (34,000 - 34,999)
skills.json (35,000 - 39,000)
monsters.json (40,000 - 41,000)
item-modifiers.json (60,000 - 61,000)
We'd still have a ton of unused space (41,000-60,000) in case we start modifying more .jsons
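To make the proposal checkable, here are the ranges above written out as data with a small out-of-range check; the bounds are copied from the list, the helper name is mine:

```python
# Proposed per-file ranges from the list above (inclusive bounds).
PROPOSED_RANGES = {
    "item-runes.json":     (30000, 31999),
    "item-names.json":     (32000, 33999),  # 50,403 - 50,413 would move in here
    "ui.json":             (34000, 34999),
    "skills.json":         (35000, 39000),
    "monsters.json":       (40000, 41000),
    "item-modifiers.json": (60000, 61000),
}

def out_of_range(file_name: str, ids: list[int]) -> list[int]:
    """Return the IDs that fall outside the proposed range for the given file."""
    lo, hi = PROPOSED_RANGES[file_name]
    return [i for i in ids if not lo <= i <= hi]
```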
My idea would be to give each file an ID buffer sized according to how many IDs it contains, and to separate the metadata files from the non-metadata ones. Since we will most likely never touch the metadata files, they don't need as large a buffer.
I did some calculations in Excel with the 65,000 limit in mind. In the picture below and in the Google Sheet version, the blue cells contain formulas and the yellow cells can be modified:
Here is the Google Sheet version if you want to give it a shot: https://docs.google.com/spreadsheets/d/1ZzKpsrX0qq53cZgnWzk81RxARPPJ_0KVZfZeqhKqank/edit?usp=sharing
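I can't speak for the exact formulas in the sheet, but the allocation it describes boils down to something like the sketch below; the growth factor, the 30k starting point, and the example counts are placeholder assumptions:

```python
def allocate(id_counts: dict[str, int], start: int = 30000,
             ceiling: int = 65000, growth: int = 4) -> dict[str, tuple[int, int]]:
    """Give each file a buffer `growth` times its current ID count,
    packed upward from `start` and kept under `ceiling`."""
    ranges, cursor = {}, start
    for name, count in sorted(id_counts.items(), key=lambda kv: -kv[1]):
        size = count * growth
        if cursor + size > ceiling:
            raise ValueError("allocation exceeds the ceiling; lower the growth factor")
        ranges[name] = (cursor, cursor + size - 1)
        cursor += size
    return ranges

# Example with made-up ID counts:
print(allocate({"item-names.json": 3000, "skills.json": 1500, "ui.json": 800}))
```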
Some overflowing IDs are in conflict with your proposed changes:
ID overflow range 97000-97055, %65536 = 31464-31519 (lands in the proposed item-runes.json range)
ID overflow range 100000-100025, %65536 = 34464-34489 (lands in the proposed ui.json range)
edit: accidentally used the wrong modulo number, fixed it
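The conflict is easy to verify mechanically; a quick sketch using the ranges proposed above:

```python
# Ranges copied from the proposal above (inclusive bounds).
proposed = {"item-runes.json": (30000, 31999), "item-names.json": (32000, 33999),
            "ui.json": (34000, 34999), "skills.json": (35000, 39000)}

for lo, hi in [(97000, 97055), (100000, 100025)]:
    alias_lo, alias_hi = lo % 65536, hi % 65536
    hits = [name for name, (a, b) in proposed.items()
            if alias_lo <= b and alias_hi >= a]   # interval overlap test
    print(f"{lo}-{hi} aliases to {alias_lo}-{alias_hi}, conflicts with {hits}")
```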
I'm saying we just redo the overflowing IDs to fit the newly proposed range definitions, which reduces the overall number of files we are editing.
We'd only have to update 3 files (skills.json, ui.json, item-names.json) and only a few dozen IDs rather than every ID. It is not the cleanest looking solution, but the least intrusive.
I have another idea (I don't know if it's possible). What if we try to create a list of IDs that are NOT being used? Like, run a script that looks for all the IDs currently in use across all the JSON files and creates a .txt with all the free IDs below 65,536. If we want to add something, we just use one of those IDs and remove it from the list.
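For what it's worth, generating that list would be a one-off script; a sketch under the same file-layout assumption as earlier:

```python
import json
from pathlib import Path

def write_free_ids(strings_dir: str, out_file: str = "free_ids.txt") -> None:
    """Write every ID below 65536 that no .json in the folder currently uses."""
    used = set()
    for path in Path(strings_dir).glob("*.json"):
        used.update(e["id"] for e in json.loads(path.read_text(encoding="utf-8-sig")))
    free = (str(i) for i in range(65536) if i not in used)
    Path(out_file).write_text("\n".join(free), encoding="utf-8")
```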
Seems like overkill; I'd rather just have the numbers easily incrementable within a file.
I'd rather not have to reference an external document every time I want to make a new entry.
Ultimately, we just need to decide if we want to redo some old IDs or not. Personally, I don't care which of the methods we go with as I won't be doing the work.
Yep, my goal was to keep the original IDs and take advantage of the available slots, but having to rely on a list and having to update it manually, although easy, is a pain in the long term. For my part, I will adapt to whatever is decided as the best solution (since I don't use scripts, it would take me forever xD)
We are probably overwriting some texts in various places with bad IDs that exceed the 65,536 cap.
We have IDs using 100000-100025 that should be redone to valid numbers.
We may also want to consider a re-org of the IDs so they are categorized, as trying to just increment after the last ID is painful when the IDs are shared between all the .json files.
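As a starting point for finding those overwrites, a sketch that flags every overflow ID whose modulo-65,536 alias is already used by another entry (same file-layout assumption as before):

```python
import json
from pathlib import Path

def overwritten_by_overflow(strings_dir: str) -> list[tuple[int, int]]:
    """Return (overflow_id, aliased_id) pairs where an ID >= 65536 wraps onto
    an ID another entry already uses, i.e. one text silently overrides the other."""
    used, overflow = set(), []
    for path in Path(strings_dir).glob("*.json"):
        for entry in json.loads(path.read_text(encoding="utf-8-sig")):
            if entry["id"] >= 65536:
                overflow.append(entry["id"])
            else:
                used.add(entry["id"])
    return [(i, i % 65536) for i in overflow if i % 65536 in used]
```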