hughsie / appstream-glib

This library provides objects and helper methods to help reading and writing AppStream metadata.
GNU Lesser General Public License v2.1
65 stars 103 forks source link

"Invalid URL type" error does not inform which type is invalid #392

Open suve opened 3 years ago

suve commented 3 years ago

Consider the following minimal example:

<?xml version="1.0" encoding="UTF-8"?>
<component type="desktop">
    <id>example.desktop</id>
    <metadata_license>Unlicense</metadata_license>
    <description><p>Text goes here.</p></description>
    <url type="privacy_policy">https://duckduckgo.com/privacy</url>
</component>

When passed to appstream-util validate-relax, this produces the following error message:

example.appdata.xml: FAILED:
• tag-invalid           : <url> type invalid [unknown]

This message is not very helpful, since it doesn't inform the user which type is invalid.

Looking briefly through the code, this is caused by as_app_add_url() not taking the URL type as a string, but rather, as an AsUrlKind enum. Hence, when as_app_node_parse_child() calls that function, it first converts the type to an AsUrlKind enum by calling as_url_kind_from_string(). Then, as_app_add_url() converts the enum back to a string representation by calling as_url_kind_to_string(). This works fine for valid URL types, but when faced with an invalid one, the "convert to enum" step causes loss of data, since all invalid types are coerced to AS_URL_KIND_UNKNOWN. When that is converted back to a string, it always yields "unknown".

One way of fixing this would be to change as_app_add_url() so it takes the url type as a string, not as an enum. On one hand, this is simple to do, since as_app_add_url() converts the enum to a string representation, either way, and then uses that as a hash table key. On the other hand, as_url_kind_to_string() returns pointers to static strings, which don't need to be freed, and changing to use each URL's type= attribute as the hash table key would mean needing to keep track of when to free these strings. Either way, if one were to go with this solution, then during validation, one could just have some is_url_kind_valid() function and use that, instead of the strcmp(key, "unknown") logic present currently.