OverlordsIII / VillagerNames

A mod that adds names to villagers and golems
https://www.curseforge.com/minecraft/mc-mods/villager-names-for-fabric/
Creative Commons Zero v1.0 Universal
8 stars 7 forks source link

Some general code quality improvements and 2 small fixes #6

Closed Fourmisain closed 3 years ago

Fourmisain commented 3 years ago

This is currently not intended to be merged, it should be fully functional though. Have a look and tell me what you think!

Find 1

While looking through the code I noticed new Random instances are created for each villager/golem name, which is not only inefficient but can heavily impact the random generation:

https://github.com/OverlordsIII/VillagerNames/blob/26f931257e26b91db0acc65da09d236ba6e84eb0/src/main/java/io/github/overlordsiii/villagernames/util/VillagerUtil.java#L46 https://github.com/OverlordsIII/VillagerNames/blob/26f931257e26b91db0acc65da09d236ba6e84eb0/src/main/java/io/github/overlordsiii/villagernames/util/VillagerUtil.java#L58

This is easily fixed by just creating one static Random instance and reusing it.

Find 2

I also noticed this check before updating the name of a villager gaining a profession:

https://github.com/OverlordsIII/VillagerNames/blob/9b99751226e50c74f8060489fe22701898b2c738/src/main/java/io/github/overlordsiii/villagernames/util/VillagerUtil.java#L87

It would block any villager with "the" in their name (like Matheus) from getting a profession text. The easy fix is to use " the ", although I'm not sure why the contains check is there in the first place, can a villager get "updated" without changing his profession?

General

While reading some more I saw a lot of code duplication for anything villager and golem name related, so I started refactoring. E.g. GolemNameArgumentType and VillagerNameArgumentType are moved into one neat little package named NameArgumentType which contain Golem and Villager inner classes and the golemName() and villagerName() helper functions. The villager and golem configs are also abstracted a little bit so it's very easy to have a general pickRandomName().

I also touched some of the inconsistent code styling, but quickly gave up on it because I'm not sure in which direction to go, e.g. is it if (){ or is it if () {?

Changelist

Some more questions

I don't understand the reason for FormattingDummy, can't Formatting be used directly?

I also don't quite understand the random name generation, usedUpNames sounds like a name should never be picked again, instead these names gets rerolled once. With the current amount of names (6700+ for villagers) it is already very unlikely to have common names, so I'm not entirely sure why it is done this way.

OverlordsIII commented 3 years ago

This looks like good stuff, thanks for taking the time to do this!

Sorry for the late response, I forgot to check on the repo in a while, sorry about that!

As for some of your findings

  1. This is a valid point, and I guess should be quite easy to fix, I fully agree with what you are saying

  2. This should be changed to " the ", but as for why the contains check exists in the first place is used when either catching something that createVillagerNames missed or when used for children names feature

  3. FormattingDummy is used to make sure that its toString representation is properly used by the config. The config uses an api called ClothConfig and when using enum types, uses its toString representation. The problem is that minecraft's formatting as a string is represented as a formatting code with the wierd symbol, so the selection box for the villager text formatting would not carry the correct values, and would instead have a wierd symbol for the user to select.

Basically, if we use minecraft Formatting instead of it using the normal text of the specific formatting, it will use something like this: §4 instead of RED

  1. I guess this is a fair point, I think it should eventually be rolled back though, maybe when its size is equal to the villager names list size, so around 6700 or so

Overall though, most of the changes make sense and I am really thankful that you would spend the time to do this. I recently started minecraft modding and coding in general, so I hope I learn more along the way.

Fourmisain commented 3 years ago

1. This is a valid point, and I guess should be quite easy to fix, I fully agree with what you are saying

It's fixed in this PR, so if you decide to merge, nothing more needs to be done. I initially said this PR isn't yet intended to be merged, mainly because I haven't fully tested every change, but I'm pretty confident everything is working. Most changes are refactoring through my IDE, so there is little chance of error.

I just noticed Github says "Draft pull requests cannot be merged" and now I'm confused. I thought drafts could be merged because I'm 99% sure I had a draft in the past (on another account) that got merged earlier than I wanted it to.

Oh well, in that case, it's now open and ready to merge!

2. This should be changed to " the ", but as for why the contains check exists in the first place is used when either catching something that createVillagerNames missed or when used for children names feature

Hm... but children can't have a profession, right? The contains check is just after the check if the villager has a profession. I can't say I fully understand, but it's probably best to not mess with it and just leave the check in.

3. FormattingDummy is used to make sure that its toString representation is properly used by the config.

Ah, got it.

4. I guess this is a fair point, I think it should eventually be rolled back though, maybe when its size is equal to the villager names list size, so around 6700 or so

Ah, I think there's some misunderstanding here, I'm not saying how the random name generation should work, I'm just trying to understand the reasoning behind how it currently works.

Like this is the current implementation (in this PR, but master works exactly like this):

        List<String> names = namesConfig.getNameList();

        //pick a random name
        int index = random.nextInt(names.size());

        //when more than half of all names were "used up", unmark all names
        if (usedUpNames.size() > names.size()/2)
            usedUpNames.clear();

        //if the name was "used up", reroll it once
        if (usedUpNames.contains(names.get(index)))
            index = random.nextInt(names.size()); //confusing: this new name could still be "used up"

        //mark the name "used up"
        usedUpNames.add(names.get(index));

        return names.get(index);

I'm basically asking what 'issue' the "used up names" concept is solving, because if you compare the above implementation with the most basic one:

        List<String> names = namesConfig.getNameList();

        //pick a random name
        int index = random.nextInt(names.size());

        return names.get(index);

common names should still be pretty rare because there just are so many of them.

I just did some math, so here are the numbers.

With the basic implementation, the chance that the first 100 villager names are unique is (6700/6700 * 6699/6700 * 6698/6700 * ... * 6601/6700) or 47% which is already pretty high.

With the current implementation, all 100 names are "used up" and hence rerolled if they were to collide, which means the chance that all 100 names are unique should look like this: ((6699/6700 + 1/6700*6699/6700) * (6698/6700 + 2/6700*6698/6700) * ... * (6601/6700 + 99/6700*6601/6700)) or 99.2%! (edit: This math is slightly off, but the 99.2% is surprisingly still correct) (edit2: I think it was just an off-by-one error)

So I guess I answered my own question, the current implementation does make common names practically (though not actually) impossible.

If you want to guarantee unique names though, the easiest way is to just shuffle the list once via Collections.shuffle(list); and progressively take the ith element (reshuffling the list and resetting i once reaching the end).

If I were to now talk about how it should work, I don't think it's bad to have a chance of duplicate names, it is human afterall, so personally I'd probably use the 'basic' implementation.

(Maybe I'll implement both variants and a config option in another pull request.)

OverlordsIII commented 3 years ago

I'm going to level with you here The whole reason that last feature (the one for making sure names dont get repeated) was added very early in this mod's lifecycle, when we only had 300 names to use. Now that we have more, it's not that useful. This comment here

if (usedUpNames.contains(names.get(index)))
            index = random.nextInt(names.size()); //confusing: this new name could still be "used up"

is just there because I was counting on the name just to not be in there. Even if it is, it lowers the chance more. But honestly I agree with what you are saying, so I am just going to check a final look around before merging.

OverlordsIII commented 3 years ago

Hmmm I'm not sure why github didn't add you to the contributors list.... if you want to, just make a dummy pr and ill accept it to get you on there

Fourmisain commented 3 years ago

Ah, I think that's because I left the git author email field empty. Github doesn't seem to think my commits are from me - not sure how to fix that, but it's no big deal, the PR itself is documentation enough and I saw you added me to the authors in fabric.mod.json as well and that's already more than I would ask for.

edit: figured it out, so next time Github will know