QuiltMC / rfcs

Repository for requests for comments for proposing changes to the Quilt Project.

RFC 19: Hashed Mojmap #19

Closed Earthcomputer closed 3 years ago

Earthcomputer commented 3 years ago

Civil discussion only please :)

Rendered View

spaceclouds42 commented 3 years ago

I am still concerned about the usability of hashed mojmap. Yarn mappings take time to update, and when developing on snapshots, new names usually take about a week to arrive, by which time there's often a new snapshot out. With intermediary, this wasn't so bad: just a simple 5-digit number to remember for a bit. However, hashed mojmap is not only a mix of numbers and letters, it is also longer.

I understand the benefits of using hashed mojmap, and I think it's a good decision. However, to improve usability of unmapped (yarn) names, I suggest using numbers as a "second intermediary". Mappings would be mapped using mojmap, then hashed as described in this RFC, then numbered, 0..num_of_classes.

Below is a table showing how it would be far easier to use "second intermediary" when yarn hasn't mapped yet. It would still keep the benefits of using hashed mojmap, since a mojmap name will still map to the same "second intermediary", without the downside of being harder to use.

| Mojmap | Hashed Mojmap | "Second Intermediary" | Yarn |
|---|---|---|---|
| **mc version x** | | | |
| MojOne | Cls_8hg7e2 | class_1 | class_1, until mapped to YarnOne |
| MojTwo | Cls_f4g6e9 | class_2 | class_2, until mapped to YarnTwo |
| **mc version x.1** | | | |
| MojOne | Cls_8hg7e2 | class_1 | class_1, until mapped to YarnOne |
| MojTwo | Cls_f4g6e9 | class_2 | class_2, until mapped to YarnTwo |
| MojThree | Cls_ik74cs | class_3 | class_3, until mapped to YarnThree |

CheaterCodes commented 3 years ago

I am still concerned about the usability of hashed mojmap. Yarn mappings take time to update, and when developing on snapshots, new names usually take about a week to arrive, by which time there's often a new snapshot out. With intermediary, this wasn't so bad: just a simple 5-digit number to remember for a bit. However, hashed mojmap is not only a mix of numbers and letters, it is also longer.

I want to reiterate, given that you have autocomplete (and I assume you do in most scenarios), you only need to remember at most the first 3 letters (even when using base26). The rest of the digits are to ensure that there are no collisions, but within the limited scope you'll usually work, having a collision with smaller hashes is still very unlikely. Think of the first few digits like git's short-hash.

Look at your own example given, the first digit alone in those examples is enough. The first two will be sufficient in most cases, the first three in realistically all scenarios. (Unless you have visibility over the entire class space as unmapped classes, then you might need digit 4 to differentiate.)
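The short-hash point can be checked with a minimal sketch. It assumes the three example hashes from the table above (with the shared `Cls_` prefix dropped); the helper name `shortest_unique_prefixes` is made up for illustration.

```python
def shortest_unique_prefixes(names):
    """Map each name to the shortest prefix that no other name in the set shares."""
    result = {}
    for name in names:
        others = [n for n in names if n != name]
        for length in range(1, len(name) + 1):
            prefix = name[:length]
            if not any(other.startswith(prefix) for other in others):
                result[name] = prefix
                break
        else:
            # No proper prefix is unique, so the full name is needed.
            result[name] = name
    return result


# Example hashes from the table earlier in the thread, without "Cls_".
hashes = ["8hg7e2", "f4g6e9", "ik74cs"]
prefixes = shortest_unique_prefixes(hashes)
print(prefixes)
```

For these three names a single character is already enough for autocomplete; more characters are only needed once many unmapped names are in scope at the same time.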

I understand the benefits of using hashed mojmap, and I think it's a good decision. However, to improve usability of unmapped (yarn) names, I suggest using numbers as a "second intermediary". Mappings would be mapped using mojmap, then hashed as described in this RFC, then numbered, 0..num_of_classes.

I'm not sure I fully understand? This sounds to me like the worst of both worlds. Wouldn't this require you to once again manually maintain an intermediary?

Below is a table showing how it would be far easier to use "second intermediary" when yarn hasn't mapped yet. It would still keep the benefits of using hashed mojmap, since a mojmap name will still map to the same "second intermediary", without the downside of being harder to use.

This table seems flawed. Intermediary mostly isn't class_1, class_2, class_3, but rather class_3127, class_3128, class_3129, unless you mean for second intermediary to only contain unmapped yarn classes? In that case, sure, we could automatically number missing yarn mappings.

I personally prefer if classes I use in the same context visually differ a lot, rather than only in the last digit. I often get confused using unmapped code in fabric, seeing so many similar class names next to each other.

spaceclouds42 commented 3 years ago

I'm not sure I fully understand? This sounds to me like the worst of both worlds. Wouldn't this require you to once again manually maintain an intermediary?

Because it is using hashed mojmap, it would still be 100% automated..? Might add a few seconds to run time, if that.
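A rough sketch of what automated numbering could look like, assuming the numbers are assigned on top of the hashed names from the table above. The function `extend_numbering` is hypothetical. It shows that the step can run without manual work, but also that it needs the previous version's table as input to keep old numbers stable, which a pure hash does not.

```python
def extend_numbering(previous, hashed_names):
    """Keep every number already assigned and give new hashed names the next free ones."""
    table = dict(previous)
    next_id = max(table.values(), default=0) + 1
    for name in sorted(hashed_names):
        if name not in table:
            table[name] = next_id
            next_id += 1
    return table


# "mc version x" and "mc version x.1" from the table earlier in the thread.
version_x = extend_numbering({}, ["Cls_8hg7e2", "Cls_f4g6e9"])
version_x1 = extend_numbering(version_x, ["Cls_8hg7e2", "Cls_f4g6e9", "Cls_ik74cs"])
print(version_x1)
```

Here `Cls_8hg7e2` and `Cls_f4g6e9` keep numbers 1 and 2 in the newer version only because `version_x` was passed back in; the numbering is automatic but stateful.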

In that case, sure, we could automatically number missing yarn mappings.

hmm, I think that would work.

Earthcomputer commented 3 years ago

@SpaceClouds42 what makes you think that md_RJ... (which is what you'd need to type for auto-complete) is harder to remember and read than method_20495, particularly when method_20495 is likely in the same class as method_20496?

Kroppeb commented 3 years ago

I do like the letters, cause they are often easier to remember than long numbers. However, I don't really like base62 with its mixed case and numbers randomly spread around. I think it would be cool if we could make them more natural.

CheaterCodes commented 3 years ago

So, correct me if I'm wrong, but as far as I understood: Hashed-Mojmap wouldn't have a repo. There is no reason to store the mappings anywhere, since they can be recomputed easily at any time. Was that never mentioned in the RFC? I feel like this is an important distinction from intermediary.

Rubydesic commented 3 years ago

The idea of directly using Mojmap is not getting nearly enough consideration.

There are unmentioned benefits:

Make reflection much more sensible

Reflection (and java.lang.invoke) are complicated and annoying to use in mods because of obfuscation. Modders working directly with Mojmap will not have to use classes like ObfuscationReflectionHelper from Forge or whatever.

Contributing towards unified mappings

Every Minecraft modding platform using its own mappings has been a huge pain point in modding for long enough. If every major platform starts using Mojmap, that would eliminate a major annoyance. In fact, still requiring remapping when every other project has moved on to official mappings might be a downside for users looking to develop for Quilt.

Enable easy(-ier) development of cross-platform mods

Much in the same vein of the previous point, there are many mods which target multiple platforms, such as fabric, paper, and forge. Using a unified mapping makes the tooling required to do such cross-platform targeting much more tenable. Furthermore, multi-platform JARs will not need to include a copy of the code for every single mapping, which is bloated and messy, leading most multi-platform projects to need a JAR for each loader.

Enable cross-platform mod libraries

Ever had to use a library with different mappings than your own? yea... suffering. With this change library creators can have their library in mojmap, and even if your project is yarn you can just deobfCompile it. If you use mojmap, no need for even that. It will also work with forge and all the other modloaders.

Does not suffer many downsides of hashed Mojmap

Has many of the same benefits as hashed Mojmap

The main downsides are weak:

Taints Yarn contributors when seen in stack traces and unmapped code

Yarn contributors are a vast minority. Quilt's decisions should first and foremost benefit the general modding community.

Yarn contributors have annoying (but workable) workarounds they can employ, like using a custom set of mappings (or just literally this hashed mojmap) so they aren't exposed to unmapped names in code, and NotEnoughCrashes or similar so that they aren't exposed to stacktraces.

Furthermore, the fear of Mojang being litigious wrt its mappings is, in my opinion, vastly overstated. It's a remnant of the absurd license they previously had and the hardline position the forge team previously took.

More likely to break mods than hashed Mojmap because no package name correction has been done

Has any investigation been done into precisely how much more likely this is? If "After applying the aforementioned corrections for package names, we found that Mojmap class names are twice as likely to change when the intermediary doesn't", then what about after NOT applying the corrections?

If it is very significant, it could be alleviated by remapping mods referencing the old package to the new one, much in the same way that intermediary-mapped mods will be remapped. Because it could reuse much of the same infrastructure, I doubt it would be a large maintenance burden.

Conclusion

Quilt should use plain Mojmap. The reasons to diverge from what the modding community in general seems to be gravitating to (Forge, Paper, Sponge), are weak, and Quilt being special just makes things difficult. If Quilt is going to try to iterate fast on something, I would much prefer it to be Mojmap and let yarn contributors decide if it's completely unworkable afterwards.

p.s. I imagine the proportion of yarn contributors voting on this RFC is extremely slanted - which stands to reason, given the demographic reading and commenting here. That said, I think it is causing the opinion on this RFC to be biased towards that which would benefit that vocal minority rather than the greater community, almost all of whom are still on Forge or Fabric and are not reading nor commenting here.

CheaterCodes commented 3 years ago

Make reflection much more sensible

Maybe that's just me, but helping people to use reflection would be at the very bottom of my interest list. Not sure what use cases you have for reflection, but for the most part there exist better solutions.

Contributing towards unified mappings

Enable easy(-ier) development of cross-platform mods

Both of these points don't really talk about intermediary or hashed-mojmap. They talk about replacing yarn with mojmap. And that is a discussion for the dev environment. I see no problem in enabling people to use mojmap in dev instead of yarn.

Does not suffer many downsides of hashed Mojmap

Now, the collision issue really is a non-issue, first of all. Second, easy to remember names is only a short-term advantage for the few newest classes not mapped in yarn, but sure, never having unmapped classes sounds nice.

Has many of the same benefits as hashed Mojmap

Yep

The main downsides are weak

My main downside: mojmap doesn't have the most amazing names. Sure, we could use some additional partial mapping, but mixing a custom mapping with mojmap in a way that makes sense would make the partial mapping highly tainted and legally very questionable to host.

I think the legal issue is relevant. Maybe not for a small mod developer who wouldn't even care much if the mod had to be removed, but a project that's meant to stay shouldn't rely on the goodwill of a third party.

Conclusion

I think the desire to use mojmap in dev is valid. I think there are plenty of good reasons to use mojmap instead of yarn. I don't think mojmap should replace intermediary. That would force people to use mojmap. That would force mod devs to rely on Mojang staying good willed.

You might say the legal argument is weak, but I think it's at least much stronger than any argument for mojmap to replace intermediary.

In dev, sure. In mod distribution? I'd rather not.

Rubydesic commented 3 years ago

Both of these points don't really talk about intermediary or hashed-mojmap. They talk about replacing yarn with mojmap. And that is a discussion for the dev environment. I see no problem in enabling people to use mojmap in dev instead of yarn

This is not true. Using mojmap in dev will not give the ability to make a mod compatible with multiple platforms in a single jar: you have to compile it for every single set of intermediary mappings you're targeting. This is annoying and terrible and a problem that the modding community seems to be moving towards fixing by just using Mojmap, except Fabric/Quilt. Sure, it's possible to develop multi-platform mods without this change, but this certainly encourages it.

If you can always expect the prod environment to be using the same mappings, then that's just one less annoyance to take care of.

I think the legal issue is relevant. Maybe not for a small mod developer who wouldn't even care much if the mod had to be removed, but a project that's meant to stay shouldn't rely on the goodwill of a third party.

Both proposals are derived from Mojmap and its license. The only major legal concern (as far as I can tell) is Yarn developers getting tainted, and it seems that we are sacrificing a lot of convenience and conformance for little benefit except to these people alone.

That would force mod devs to rely on Mojang staying good willed.

Again, I don't think this is a major concern. Much of Minecraft modding already depends on Mojang staying good willed, if you look at the EULA - such is what one must accept when making mods for a proprietary game. After all, while they permit mods, remember they also wrote that "We have the final say on what constitutes a Mod and what doesn't." Not to mention, the mappings license is pretty OK now after all the backlash from the old one.

pluiedev commented 3 years ago

This is annoying and terrible and a problem that the modding community seems to be moving towards fixing by just using Mojmap

This is incredibly apparent when you have to work with frameworks like Architectury to cover a wider audience. Sure, you can use Yarn in Architectury but the API itself is littered with names taken from Mojmap, and if you're a Yarn contributor, F for your (presumably already) polluted eyes.

Furthermore, the fear of Mojang being litigious wrt its mappings is, in my opinion, vastly overstated. It's a remnant of the absurd license they previously had and the hardline position the forge team previously took.

I agree; I'm not that concerned with Mojang suddenly demanding legal action on modders since they used their mappings outside of their control, and since other frameworks such as Forge and Paper are already switching, or have already switched to Mojmap anyway

My main downside: mojmap doesn't have the most amazing names.

I whole-heartedly agree, since Mojmap doesn't have to be bikeshedded by a group of hard-working volunteers for days and can be changed on a whim by a single dev

CheaterCodes commented 3 years ago

I feel like having a fat jar for different mod loaders would have bigger issues than just different mappings. Also, there is still the option of remapping mods when loading them, similar to what will happen when loading fabric mods.

Not exactly sure which second proposal you're referring to, but assuming you mean hashed-mojmap: the difference here is that hashed mojmap doesn't redistribute mojmap in the compiled jars.

CoolMineman commented 3 years ago

[Two images; not reproduced here.]

i509VCB commented 3 years ago

I feel this is out of scope for this RFC, but there needs to be a discussion to firmly set in place the full legal picture regarding yarn, as much of it is hearsay or a long reach of "this may be correct". We cannot have a discussion about whether something would truly taint yarn or not without a concrete answer from such a discussion, which will not happen in this RFC.

Earthcomputer commented 3 years ago

As for the legal implications of distributing hashed mojmap, I see it as fine because although you can generate hashed mojmap from mojmap, you cannot generate mojmap from hashed mojmap. Saying you are redistributing mojmap by distributing a list of hashes is like saying that you can redistribute mojmap by giving each entry a number. In fact, giving each entry a number distributes more information than hashed mojmap, as it tells people the order the entries were in.
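To illustrate the one-way property, here is a minimal sketch. Everything specific in it is an assumption made for the example: SHA-256 as the hash, a lowercase base-26 encoding, the `Cls_` prefix borrowed from the table earlier in the thread, and an 8-letter length. It is not the algorithm from the RFC or from any Quilt tool.

```python
import hashlib
import string


def hashed_name(mojmap_name: str, prefix: str = "Cls_", length: int = 8) -> str:
    """Derive a short base-26 name from a Mojmap name (illustrative scheme only)."""
    digest = hashlib.sha256(mojmap_name.encode("utf-8")).digest()
    value = int.from_bytes(digest, "big")
    letters = []
    for _ in range(length):
        # Peel off one base-26 digit at a time and map it to a lowercase letter.
        value, remainder = divmod(value, 26)
        letters.append(string.ascii_lowercase[remainder])
    return prefix + "".join(letters)


# Same input, same output: anyone holding Mojmap can recompute the name.
first = hashed_name("net/minecraft/a/Foo")
second = hashed_name("net/minecraft/a/Foo")
other = hashed_name("net/minecraft/b/Foo")
print(first, other)
```

Going forward is a single function call. Going backward from the output means guessing candidate Mojmap names and hashing each one, because the truncated digest does not contain the original string.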

LambdAurora commented 3 years ago

In-depth reply to https://github.com/QuiltMC/rfcs/pull/19#issuecomment-841588768

Let's start by taking a look at the given points

Make reflection much more sensible

This seems dangerous to me. Even though reflection can be useful, it's a double-edged sword. Mojmap or not, its use should be discouraged anyway, as we still have many other tools that are faster at runtime, less likely to break with Java updates, and handle remapping of targets (Mixin and other transformation tools). Sure, those tools don't cover all cases, but they cover enough for reflection to be occasional.

Contributing towards unified mappings

Totally valid, having unified mappings would be better for everyone. But as a yarn modder and contributor, I would refuse to touch mojmap. That's highly biased, sure, and this argument does not really belong in this specific debate.

Enable easy(-ier) development of cross-platform mods

This doesn't solve the issue of cross-platform APIs, etc. You always end up needing a tool to dev and make those one-jar lot-of-modloaders mods.

It would be mostly a mess tbh.

Enable cross-platform mod libraries

Why not, would facilitate the previous point.

The main downsides are weak:

Yarn contributors are a vast minority. Quilt's decisions should first and foremost benefit the general modding community.

Yarn contributors are a minority, but yarn modders are not that much of a minority I believe. And I'm not sure all of them would be pleased to be exposed to mojmap.

Yarn contributors have annoying (but workable) workarounds they can employ, like using a custom set of mappings (or just literally this hashed mojmap) so they aren't exposed to unmapped names in code, and NotEnoughCrashes or similar so that they aren't exposed to stacktraces.

User-given stacktraces should never ever be in mojmap; if I receive one in mojmap I won't be able to do user support. I really don't want to rely on some random mods just to let users get support. It does have the benefit of being less work for mojmap users, but it still doesn't solve the issue for yarn users.

Furthermore, the fear of Mojang being litigious wrt its mappings is, in my opinion, vastly overstated. It's a remnant of the absurd license they previously had and the hardline position the forge team previously took.

As far as I know, Mojang employees got in contact with modloader people to talk about the licensing stuff, which resulted in the various licensing changes. But it doesn't mean we absolutely have to use those mappings.

If Quilt is going to try to iterate fast on something, I would much prefer it to be Mojmap and let yarn contributors decide if it's completely unworkable afterwards.

I'm not sure I appreciate this sentence tbh, I mostly interpret it as "Let's use mojmap and fuck yarn, if it's unworkable afterwards well too bad!"

Now let's talk about other points

Ok, Mojmap is very interesting as a common mapping, and as a common intermediary. But as far as I know, if you want a common intermediary, it would also require Forge to do it (and as seen a few messages earlier it's still SRG), and there's little to no interest in supporting that for Spigot, as it's inherently broken for modding with the enum spam and an obsolete API that should have died a lot of versions ago.

Also, maybe you want to use mojmap, but there are people who don't want to, for a lot of reasons, mostly bad names.

Also, let's talk quickly about yarn and put aside all legal considerations. Even if yarn contributors could know Mojmap names, it would still go against yarn's principle of being a mostly clean-room naming. It has a lot of remnants of known names (from strings or from MCP, as some contributors were using it before), but they are also discarded at one point or another, like the NBT package, which got a refactor recently to drop all Notch names, have a more GSON-y naming, and reduce confusion with the datapack Tag. So even with the strings and such, yarn is still trying to be a clean-room mapping set and I would like it to stay that way. Using Mojmap as intermediary just goes against that.

Conclusion

Even though Mojmap can be interesting, it just doesn't make the cut for me. There are merits, but the cons outweigh them, while Hashed Mojmap keeps most of the merits and reduces the cons.

marshoepial commented 3 years ago

A couple of my thoughts:

spaceclouds42 commented 3 years ago

If we do decide to include a second intermediary, would it be possible for modders to choose between it or hashed Mojmap? For example, in a development environment you could possibly use two different Gradle scripts to setup with either the hashed Mojmap or the secondary intermediary. At compiletime these would be swapped back to mojmaps of course, but it would be nice to decide between the two in a development environment.

You would never use either of those in your dev env. In loom, the mappings function sets which mappings to use, and the default is yarn. However, there could be a fallbackMappings function that lets you use mojmap, hashed, or second (if that gets implemented, though I happen to doubt it) as the fallback for when yarn doesn't have something mapped. Actually, now that I think of it, can we please get this regardless of the second intermediary? It would be a cool thing to have: being able to set the fallback mappings to something else that is fully mapped, like mojmap. It wouldn't be that useful for yarn contributors, but as a person who likes using yarn but doesn't contribute to it, I could use yarn and any unmapped things would use mojmap, or whatever fallback is set (the default being hashed).
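A rough sketch of how such a fallback lookup might behave. This is purely hypothetical: `fallbackMappings` is the idea proposed in this comment, not an existing loom function, and `resolve_name` plus the dictionaries below are invented for illustration using the names from the table earlier in the thread.

```python
def resolve_name(hashed, primary, fallbacks):
    """Return the primary (yarn) name if mapped, else the first fallback that has one."""
    if hashed in primary:
        return primary[hashed]
    for mapping in fallbacks:
        if hashed in mapping:
            return mapping[hashed]
    # Nothing maps it: the hashed name itself is the default fallback.
    return hashed


yarn = {"Cls_8hg7e2": "YarnOne"}
mojmap = {"Cls_8hg7e2": "MojOne", "Cls_f4g6e9": "MojTwo"}

mapped = resolve_name("Cls_8hg7e2", yarn, [mojmap])
unmapped_in_yarn = resolve_name("Cls_f4g6e9", yarn, [mojmap])
unmapped_everywhere = resolve_name("Cls_ik74cs", yarn, [mojmap])
print(mapped, unmapped_in_yarn, unmapped_everywhere)
```

Yarn wins wherever it has a name, Mojmap fills the gaps, and anything still unmapped stays as the hashed name.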

For the hashing, a big worry of mine is memorization and recognizability

Same here, and I think fallbackMappings could solve that

marshoepial commented 3 years ago

In loom, the mappings function sets which mappings to use, and the default is yarn.

Sorry, I was talking about unmapped functions. Having mojmaps be an option for a fallback would be pretty helpful.

CheaterCodes commented 3 years ago

Is there anything stopping Mojang from restricting the mojmap license, or taking away mojmap altogether? I'm not super aware of their complete license so this worry may be unfounded. It's great that they're making their mappings somewhat available but I don't think it's a good idea to rely on them so heavily if they can be pulled out from underneath us. A good alternative would be maintaining a secondary intermediary like @SpaceClouds42 mentioned as long as that intermediary is very automated (I don't want to introduce any greater workload). That way we have a fallback already maintained in case something does happen with the mojmap.

First off, Mojang can at any point stop distributing their mappings. However, their current mappings have a license and they can't simply withdraw that.

Now, in the (unlikely) case that Mojang does that, we can simply freeze hashed mojmap in place and continue using it as a sort of intermediary. Or even automatically remap yarn to a new intermediary. Simple enough, there's no need to start maintaining an intermediary before that happens.

If we do decide to include a second intermediary, would it be possible for modders to choose between it or hashed Mojmap? For example, in a development environment you could possibly use two different Gradle scripts to setup with either the hashed Mojmap or the secondary intermediary. At compiletime these would be swapped back to mojmaps of course, but it would be nice to decide between the two in a development environment.

People can use in dev whatever they want. If it's a popular request someone will do it. It's a good idea to allow for the possibility in the planned gradle plugin, but this definitely shouldn't be a priority right now.

For the hashing, a big worry of mine is memorization and recognizability. This is more "out there", but I know some implementations use a list of short English words (maybe about 2,000 words long and under 4 letters each) and concatenate four or five to represent the hash. Or maybe use SHA-256 fingerprints in some way, obviously not for names but making them available could make a good visual identifier that for example dev tools could show to modders.

First off, it's not that bad. You'll have to memorize at most the first 2-3 digits. In most local contexts even less. Second, if you want to use words, the names get insanely long. 2000 words under 4 letters is hardly better than random letters (there are only 26^3 = 17,576 ways to arrange 3 letters in the first place), and for a sensible total length you end up at around 4-5 words from a really big dictionary.
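The arithmetic behind this comparison can be sketched as follows, assuming a word list of exactly 2,000 entries and uniformly random picks. The helper `letters_equivalent` is made up for the example.

```python
import math

three_letter_names = 26 ** 3          # every possible 3-letter lowercase string
bits_per_letter = math.log2(26)       # about 4.70 bits per random letter
bits_per_word = math.log2(2000)       # about 10.97 bits per word from a 2,000-word list


def letters_equivalent(num_words, dictionary_size=2000):
    """Number of random base-26 letters needed to match num_words dictionary words."""
    total_bits = num_words * math.log2(dictionary_size)
    return math.ceil(total_bits / math.log2(26))


print(three_letter_names)
print(letters_equivalent(1), letters_equivalent(4), letters_equivalent(5))
```

Under these assumptions one word from the list distinguishes about as many names as 2.3 random letters, so four or five concatenated words are matched by roughly 10 to 12 letters.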

We had a discussion in discord, if I find the time I might put up some stuff in this discussion.

Earthcomputer commented 3 years ago

Please, before commenting, read the RFC. Some people are making points that are already discussed/rebutted in the RFC itself. If you disagree with a rebuttal in the RFC, you should address that rebuttal rather than repeating points people have already made countless times.

CheaterCodes commented 3 years ago

One thing I'd like to know about this is if hashed-mojmap would be hosted by quilt. I.e. will there be mappings published or should users generate them locally as needed?

While an automated system pushing the new mappings every release is only a small amount of work to maintain, having no system at all would arguably be less work. My first thoughts:

Pros:

OroArmor commented 3 years ago

I don't think the process should be fully automated. It should not have to be as manual as matcher is, but if someone has to click a "Generate new Hashed-Mojmap" button and then sees that there is an error, they can go in and manually correct that. This could also tie into LambdAurora's concerns about having net/minecraft/a/Foo then adding net/minecraft/b/Foo, where there could be some form of a file between versions to maintain compatibility.

Something simple like:

net/minecraft/a/Foo CLASS_NAME_ONLY
net/minecraft/b/Foo FULL_PACKAGE_NAME

could work. This would probably also never be used.

In the case that Mojang changes a name without changing functionality

net/minecraft/a/FooRenamed HASH_FROM_NAME net/minecraft/a/Foo

could also be done to improve inter-version compatibility.
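One possible reading of those override lines, sketched under loud assumptions: each line is `<class> <RULE> [argument]`, the rule decides which string gets fed to the hash, and classes without an entry hash their full name. The rule semantics, the default, and the helpers `parse_overrides` and `hash_input` are guesses for illustration, not part of the RFC.

```python
def parse_overrides(text):
    """Parse '<class> <RULE> [args...]' lines into a dict keyed by class name."""
    overrides = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        name, rule, *args = parts
        overrides[name] = (rule, args)
    return overrides


def hash_input(class_name, overrides):
    """Pick the string that would be hashed for a class under the assumed rules."""
    rule, args = overrides.get(class_name, ("FULL_PACKAGE_NAME", []))
    if rule == "CLASS_NAME_ONLY":
        return class_name.rsplit("/", 1)[-1]
    if rule == "HASH_FROM_NAME":
        # Reuse whatever the old name would have hashed from (assumes no cycles).
        return hash_input(args[0], overrides)
    return class_name


override_file = """
net/minecraft/a/Foo CLASS_NAME_ONLY
net/minecraft/b/Foo FULL_PACKAGE_NAME
net/minecraft/a/FooRenamed HASH_FROM_NAME net/minecraft/a/Foo
"""
overrides = parse_overrides(override_file)
print(hash_input("net/minecraft/a/FooRenamed", overrides))
```

With this reading, the renamed class hashes from the same input as the class it replaced, so its hashed name would stay the same across versions.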

LambdAurora commented 3 years ago

Bit scared that if this isn't fully automated it would remove some of its benefits.

CheaterCodes commented 3 years ago

The mappings themselves should definitely not be "fixed manually". If there is some error, it has to be fixed in the toolchain. If we start manually fixing hashed mojmap, we end up just maintaining another intermediary.

OroArmor commented 3 years ago

That does make sense, and I guess I'm partially trying to be the devil's advocate to make sure every option is thought of. A fully automated process would be nice, but in the case there is a critical issue, and the person who created the system is gone, there could be a serious issue.

TheGlitch76 commented 3 years ago

Can Clients/Servers get a complete enough mapping on their own? (I.e. without a merged jar/mapping)

@CheaterCodes For 1.17+: The client has no concerns here as it is the authority. For the server on 1.17+ and everything else we'd need to investigate how Tiny Remapper would handle it.

skyrising commented 3 years ago

Even if it's possible for instances to compute it themselves it's still better to compute it once and host it centrally to avoid wasting energy on thousands of computers and to save on some startup time.

TheGlitch76 commented 3 years ago

I agree with skyrising here--the important part is that anyone can generate hashed mojmap with publicly available tools and they will be the exact same as whatever Quilt produces. If Quilt's maven stops being reliable for downloading mappings, there are much bigger issues that projects using our tools will need to deal with (for example, old versions of Loader are not able to correctly parse versions of snapshots of major release cycles that started after their release)

As for the update times: we already need to figure out a better solution for detecting when a new Minecraft update is released for our Discord server, so we could tell whatever does that to trigger a webhook in GitHub Actions too.

chexo3 commented 3 years ago

This reminds me of another potential issue. What about people maintaining Quilt for versions that don’t have MojMap (IE: Legacy Fabric 1.8.9 and Cursed Minecraft Legacy for b1.7.3) or if Mojang stops providing mappings? Is there a good alternative to hashed mojmap?

Earthcomputer commented 3 years ago

Legacy Quilt can continue maintaining their own intermediary, as they have had to do already. If Mojang stops providing mappings we will start matching ourselves again, with the existing hashed mojmap staying and new names being generated a different way. However I see this scenario as very unlikely.

chexo3 commented 3 years ago

Wouldn’t it be easier to do something now that wouldn’t break anyways? I have a few ideas for that but they’re not ironed out yet.

Earthcomputer commented 3 years ago

Let's not plan around things that are extremely unlikely to happen.

OroArmor commented 3 years ago

Intermediary can be in any format, and is provided via yarn for remapping in dev (I think). Any version of intermediary can be used and mappings can be built on top of them. There is no need for supporting old versions, especially ones that old with barely any mods.

CheaterCodes commented 3 years ago

There is now an implementation of this at https://github.com/QuiltMC/mappings-hasher. Note some differences:

CheaterCodes commented 3 years ago

I updated the PR to reflect the changes made while implementing. This is mostly just the change from base-62 to base-26 and the prefixes. I also removed the mixed-case drawback since that's no longer applicable. LGTM?

Haven-King commented 3 years ago

Moving this to final comment. The final comment period will conclude on July 28th. Anyone who hasn't had a chance to voice their concerns yet, please do so 🙂

HowardZHY commented 1 year ago

The point of mappings is obf 2 human-readable, NOT obf 2 another kind of obf.

OroArmor commented 1 year ago

The point of mappings is obf 2 human-readable, NOT obf 2 another kind of obf.

This is making a mapping set that is consistent across multiple Minecraft versions, as the mappings for the Minecraft jar often change.