linebender / norad

Rust crate for working with Unified Font Object files
Apache License 2.0
43 stars 12 forks source link

`LayerSet` is ordered when it is not implied or documented as such #326

Closed RickyDaMa closed 11 months ago

RickyDaMa commented 1 year ago

A set is traditionally unordered, so one would expect LayerSet to be unordered. The fact that it is actually ordered wouldn't matter if it wasn't for the fact that its PartialEq implementation relies on ordering

This came up as a frustrating footgun while I was fixing a test within a bigger project, and until I questioned norad and looked into its internals I didn't know why I could compare individual layers and have them all be equal, but have two mismatching layer sets

Suggested solutions:

  1. Making LayerSet actually unordered (though I daresay there's some reason why it is ordered)
  2. Documenting that LayerSet is ordered, preferably changing it's name to not mislead
  3. Make the PartialEq implementation be order-insensitive. This may be challenging without incurring allocations
anthrotype commented 1 year ago

I agree LayerSet is a bit of a misnomer.. layers in UFO are in fact ordered, layercontents.plist is an array of (layer_dir_name, layer_name) so the the order does matter.

This list also defines the order of the layers from top to bottom.

https://unifiedfontobject.org/versions/ufo3/layercontents.plist/

anthrotype commented 1 year ago

although it is true that in ufoLib2 the LayeSet.__eq__ (automatically generated by attrs for us) only compares the inner _layers attribute which in turn is defined as a Dict[str, Layer] so... python isn't actually considering the (insertion) order of the dicts when comparing them. Maybe it should, if LayerSet is supposed to model the layercontents.plist.

anthrotype commented 1 year ago

I'd vote for

  1. Documenting that LayerSet is ordered, preferably changing it's name to not mislead

The "LayerSet" name probably comes from ufoLib2, which in turn inherited it from defcon, and the latter probably from robofab? Each successive iteration of the ancestral UFO API attempted somehow to maintain the old names for backward compatibility sake so they can be used as drop-in replacements. It's probably too late for a name change.

RickyDaMa commented 1 year ago

Thanks for the history behind it Cosimo, that's good to know! I didn't know the wider context to know which fix would be best, hence putting it into an issue

Except for not matching ufoLib2 (and any potential confusion that may cause), I don't see a particular issue with changing the struct name. The field on Font is called layers, and that's the only place you see LayerSet in the public API. You also (should) never want to create one directly in code, instead manipulating the already present one*. Essentially the name "LayerSet" is never going to be seen in code of developers using this library, so changing the name is low impact in that sense

(* this is something I'm working on writing an issue about, but needs a lot more time than this quick annoyance I spotted)

cmyr commented 1 year ago

Just to add my two cents: conceptually the only strong implication of the term 'set', to me, is that items are unique. The fact that hashsets offer no ordering guarantees is an implementation detail, but it is entirely reasonable for sets to use a consistent ordering, such as is the case with the BTreeSet in rust's std.

I definitely agree that it would be nice to mention this in the docs.

RickyDaMa commented 1 year ago

From Wikipedia (not the be all and end all, but still):

a set is an abstract data type that can store unique values, without any particular order

At an ADT level, sets are unordered. Thus, unless explicitly stated, I think most would assume the order is arbitrary and not something to be relied upon. BTreeSet has it in the name: it's backed by a b-tree, therefore I know it's an ordered collection of unique values

I'm not wanting to split hairs here, just would like to improve the situation for future users so they don't have the same experience I did

rsheeter commented 1 year ago

Naming is the hardest part ;)

if LayerSet is supposed to model the layercontents.plist

Naively, might we call it LayerContents to make this relationship more explicit?

RickyDaMa commented 1 year ago

Naming is the hardest part ;)

So true it hurts 🥲

Naively, might we call it LayerContents to make this relationship more explicit?

I would be all for that. Happy to open a PR for whatever solution we settle on

cmyr commented 12 months ago

Okay I think this makes sense, LayerContents better matches the spec in any case. I also agree that the docs here should be more explicit (and should link to https://unifiedfontobject.org/versions/ufo3/layercontents.plist/)