hughsie / appstream-glib

This library provides objects and helper methods to help reading and writing AppStream metadata.
GNU Lesser General Public License v2.1
65 stars 103 forks source link

Allowing passing unknown attributes #431

Open Jehan opened 2 years ago

Jehan commented 2 years ago

as_release_get_description() and other similar API will clean out the returned XML from any unknown attribute. While it is understandable as a default, I would like to propose we get a way to get a description with unknown attribute.

Context: in GIMP, we are starting to use the AppStream data to show the release notes at first launch when a new version is installed/updated. Since the data was already here, localized and all, it seemed a good way to use it. I want to go further by adding some internal data which allow us to link a release point to specific GUI elements (people could then click a release point and the GUI element would blink). And basically how I want to do it is by adding some attribute to the release points. Unfortunately I realized these were removed by appstream-glib.

Note that we can for instance just store this data in yet another file, maybe with an order relation. But it's less foolproof. If we add new release items in the middle or reorder the items, we'd always have to make sure we don't break the relations. The real good and clean solution would be that appstream-glib allowed us to get uncleaned release items.

hughsie commented 2 years ago

Got any idea for an API?

Jehan commented 2 years ago

I see 2 ways:

  1. Either we could have a separate API as_release_get_raw_description();
  2. or since the AsRelease is a class, it's stateful. So the second solution could be add a settings API. Something like as_release_allows_unknown() (I'm not sold to this API name! Naming is hard!), then any call to normal as_release_get_description() would get raw description (with all unknown attributes).

Normally I'd think the second idea is nicer because we reuse the existing API, but since it looks like description is the only contents which would be concerned by this, the first idea is probably better.

hughsie commented 2 years ago

Thinking about this some more; isn't putting unknown tags in to the description metadata going to make it fail the appstream validator?

ximion commented 2 years ago

Thinking about this some more; isn't putting unknown tags in to the description metadata going to make it fail the appstream validator?

It absolutely will, but I think unknown attributes will just be ignored and just not rendered (and I think that's true for libappstream and libappstream-glib).

Jehan commented 2 years ago

In my case, it's not tags, only attribute, and it doesn't fail in appstream-util (even in the validate-strict variant). This was the first thing I verified. ;-)

I haven't tested other validators. Are there any? (I know there is an appstream validator on flathub too, I'm not sure what tools it uses and have not tested this on Flathub, though I should)

This being said, adding unknown tags would be interesting too, because it would allow to add translatable text using the existing XML-aware gettext infrastructure (stuff like <_p>some translatable text</_p>). I just tested. Adding random tags to the AppStream file does not break appstream-util validate-strict either.

After all a great advantage of XML is its extensibility (it's even in the name!), i.e. that you can add more types of custom nodes, attributes of whatelse for custom needs. A specific purpose parser should not break on unknown tags or properties (or ideally it should have several modes, so that you can choose if you want to break on, ignore/discard, or pass along the unknown tags/properties). Most XML formats work this way (I used to do a lot of things with XMPP and one of the main point was its extensibility to communicate organized data with sub-specifications).

Right now, I am pre-processing the AppStream file with a generic XML parser to generate a header with the info (since I could not use appstream-glib). This works very well of course and possibly I'll stay like this now that I did this work. But long term, I think that it would be a good idea to allow passing the unknown data along (as an option). As I said, I believe a good XML-format parser should allow such usage.

For the record, this is the kind I think which GIMP does: https://twitter.com/zemarmot/status/1500781782857392130

And it does it from this single "demo" attribute in our AppStream file (inside the <release> tag):

          <_li demo="toolbox:bucket-fill,
                     tool-options:fill-area=2,
                     tool-options:line-art-stroke-border=1">
          New "Stroke borders" option in Bucket Fill tool's "Fill by line art detection"
          </_li>
ximion commented 2 years ago

I haven't tested other validators. Are there any? (I know there is an appstream validator on flathub too, I'm not sure what tools it uses and have not tested this on Flathub, though I should)

Yes, the reference implementation, appstreamcli validate --pedantic --explain myapp.metainfo.xml - Ideally test with that too, at the moment it understands quite a lot more of the AppStream specification (but you likely don't use this features, or else appstream-util would likely complain).

The libappstream policy for unknown stuff is to flag it as bad when validating (after all it may be a misspelled tag, the person is trying to use a feature that isn't supported by the implementation yet), strip such data when exporting (we only export pristine AppStream data, ideally) as software centers must be able to understand the generated data, but keep extra attributes when loading data. In general, I much prefer to add generally useful features to the specification properly for everyone to use rather than having people hack around limitations in AppStream, or have people make use of the custom and tags tags for vendor-specific custom features. Your feature falls into a quite unique category, as it's both pretty specific to your usecase and you can't make use of the custom tag for it.

ximion commented 2 years ago

Btw, did you consider using Itstool for translation? Using plain xgettext with AppStream's ITS rules may also work for many usecases (I even added a helper for this to newer Meson versions to make translating AppStream data easier: https://mesonbuild.com/i18n-module.html#i18nitstool_join ). The advantage of using this scheme is that you don't need to prefix tags with underscores and the source XML as well as the translated XML validate exactly the same. It's also IMHO a bit nicer to read, but of course that's subjective.

Jehan commented 2 years ago

Yes, the reference implementation, appstreamcli validate --pedantic --explain myapp.metainfo.xml - Ideally test with that too

I confirm that this tool does not complain on unknown attributes, so it's cool on this side (regarding my current implementation). But it does error out on unknown tags.

In general, I much prefer to add generally useful features to the specification properly for everyone to use rather than having people hack around limitations in AppStream, or have people make use of the custom and tags tags for vendor-specific custom features. Your feature falls into a quite unique category, as it's both pretty specific to your usecase and you can't make use of the custom tag for it.

Yeah I am not pushing this as a feature proposal. It's obviously specific to our program and how it works. ;-)

I just think it's a nice idea to be able to add additional data (as long as it doesn't break other parsers; it could even be in separate XML namespaces if needed be). The reason why I thought to reuse the AppStream is that I realized we already have a list of release items, and it's even already translated by our translator teams. So it just made sense to tie this to the AppStream data. And while doing this, avoiding out-of-sync data requires to be in the same file.

Btw, did you consider using Itstool for translation? […]

Sure looks interesting. Now I wouldn't work on this though I would accept a patch for this if it's considered the way forward (is the old _ prefixed tag solution considered kind of obsolete now?). I have enough stuff to do already not to create myself more work for something which just works right now. :-P