Wilm0r / giggity

Generic Android conference schedule viewer
https://gaa.st/giggity
GNU General Public License v2.0
101 stars 104 forks source link

Pretalx files can contain events with non-unique id= values, consider switching to guid= #252

Closed waffshappen closed 1 year ago

waffshappen commented 1 year ago

Adding the Nordic Fuzzcon Schedule from https://events.nordicfuzzcon.org/nfc-2022/schedule.xml works mostly well enough but with a weird bug.

When selecting "Mr Fleischwolf's Factory of Fantastical Fun" on the 22 Feb it will instead open the Event viewer for the 25th - and setting a Reminder for it will also highlight the 25th on the Event Schedule. Same for the "Fursuit Mini Lounge" or "Waffle Station".

Wilm0r commented 1 year ago

This is kind of WAI: Note that "Mr Fleischwolf's Factory of Fantastical Fun" exists several times (7?) but has only two unique id= values: 101 and 281, a few times each.

I see truly unique guid= values but I don't know when those got added to the Frab/Pentabarf spec. Giggity could switch to use those instead, but I don't know how common they are. It is on https://github.com/frab/schedule.xml/blob/master/schema/schedule.xml.xsd at least.

Wilm0r commented 1 year ago

(Really wondering what the point having both is though, and why one would allow non-unique integer id's at all..)

waffshappen commented 1 year ago

(Really wondering what the point having both is though, and why one would allow non-unique integer id's at all..)

They are using pretalx for it so that probably has a feature for this?

I can only imagine that they added the event once in some way - since all these are hosted by the same person and do the same, its just a slow event that only allows a few attendees at a time over a multi-hour window so its been set up over multiple days so everyone has a chance to enter.

Thus they might have the main event (same by id) then duplicated across days with a uniq from pretalx each time? This way they could change the description on one and have it across them all too.

Thats the only thing that would make sense to me but if file formats would ever make sense that'd be too easy.

Maybe it'd make sense to make the app check for both id (if exists) + guid (if exists) and combine them into an identifier?

Wilm0r commented 1 year ago

but if file formats would ever make sense that'd be too easy.

Ha-ha, exactly. :( Let me show what I've found so far... This is from the schema I found earlier:

    <xs:unique name="guid_unique">
      <xs:selector xpath="day/room/event" />
      <xs:field xpath="@guid" />
    </xs:unique>

    <xs:unique name="id_unique">
      <xs:selector xpath="day/room/event" />
      <xs:field xpath="@id" />
    </xs:unique>

    <xs:attribute name="id" type="xs:positiveInteger" use="required"/>
    <xs:attribute name="guid" type="uuid" use="required"/>

Not like I understand XML schemas but this looks like Frab expects both of the IDs to be unique. Why have both? Oh well.. But also, amusingly it wants id= to be integer, and well, I've just downloaded all the XML-formatted scheduled from Giggity's menu entries and found several examples of the id= field being formatted like a UUID (with dashes or sometimes without).

Oh hah and flosscon_2022 has guid= values like 3vHjol7AGaxVbxYLXSaAXw. Maybe just the same thing but base64-encoded? $shruggie ... At least I hope nobody would ever enforce Frab's schema on scheduled output by anything else.

Ah and this is from Pretalx' XSLT (?) :

            {% for talk in room.talks %}<event guid='{{ talk.uuid }}' id='{{ talk.submission.id }}'>

So talk.submission vs. talk submission instantiation or so? Definitely wondering whether your conf organisers intentionally submitted this event twice and made separate copies of both. At least this is the first time I encounter this. I'm kinda curious now whether I've got duplicate id= entries in the menu somewhere already, but not prepared to grep my way to an answer there..

Maybe it'd make sense to make the app check for both id (if exists) + guid (if exists) and combine them into an identifier?

Hmm. Probably not much value in using both, I think I'll go for using guid= when available, and id= otherwise. While it'd suck if at some point an existing schedule adds or removes them, causing all user selections to become invisible, that just doesn't seem very likely at all...

(And let's have the parser do this only for events scheduled after today or so, again to make sure no selections get lost. Hm. Except then there's the issue of people still running Giggity 2.0.8 and perhaps not upgrading fast enough. Yay isn't backward compatibility FUN?)

Wilm0r commented 1 year ago

Oh yes and full-on switching to guid= atts won't work either because for example even this year's FOSDEM schedule didn't have them yet:

wilmer@veer:~/android/deoxide/menu/frab$ for f in *; do grep -qw guid $f || echo $f; done
bibtag22.json
fosdem_2019.json
fosdem_2020.json
fosdem_2021.json
fosdem_2022.json
fosdem_2023.json
minidebconfindia2021.json
oat19.json
oat22.json
pgconfeu2022.json
pgdayparis_2022.json
promconeu-2019.json
sfscon_2020.json
sfscon_2021.json
sfscon_2022.json
tuebix_2019.json
wikimania_2019.json
waffshappen commented 1 year ago

At least I hope nobody would ever enforce Frab's schema on scheduled output by anything else.

I can only imagine that someone will point you to these very Words in a few Years

So talk.submission vs. talk submission instantiation or so? Definitely wondering whether your conf organisers intentionally submitted this event twice and made separate copies of both. At least this is the first time I encounter this.

I'm going to try and point them to this issue - maybe they can give insight to their specific setup?

Hmm. Probably not much value in using both, I think I'll go for using guid= when available, and id= otherwise. While it'd suck if at some point an existing schedule adds or removes them, causing all user selections to become invisible, that just doesn't seem very likely at all...

Maybe a migration once would make sense if guid is found to check for old entries that are bound to an id, then mark the guid instead? And this could absolutely happen - if this is a recent update of pretalx to allow this specific behaviour any Convention updating their instance soon might loose all saved user entries this way otherwise.

Wilm0r commented 1 year ago

Yeah, I'm going to try that .. prefer guid= where available, and at load time, perform conversion if needed. That means touching the most fragile file across all of Giggity (Db.java is terrible), but fingers crossed. :-)

BTW as I'm writing this I'm already finding interesting unittest failures on the 36C3 schedule. Guess what, it uncovered a bunch of dupes there too:

wilmer@ruby:~/android/deoxide$ grep -ow 'event id=[^ ]*' app/src/test/resources/36c3_merged.xml | sort | uniq -d
event id="347"
event id="367"
event id="390"
event id="883"
event id="884"
event id="885"
event id="886"
event id="887"
event id="890"
event id="891"
event id="892"
event id="893"
event id="894"

And to make things more interesting, they're not quite as obvious as your example:

      <event id="892" guid="76456644-b7e6-5dc0-bd9e-3350a42a47b0">
        <logo/>
        <date>2019-12-28T23:00:00+01:00</date>
        <start>23:00</start>
        <duration>01:40</duration>
        <room>Sendetisch</room>
        <slug>JHU7UE</slug>
        <url>https://fahrplan.das-sendezentrum.de/36c3/talk/JHU7UE/</url>
        <title>Binärgewitter Talk</title>

and

      <event id="892" guid="5ba2eccb-c550-5d46-a8a1-2ca211fe59a6">
        <logo/>
        <date>2019-12-30T14:00:00+01:00</date>
        <start>14:00</start>
        <duration>01:00</duration>
        <room>cinopolis</room>
        <slug>JUAUZT</slug>
        <url>https://talks.komona.org/36c3/talk/JUAUZT/</url>
        <title>Subtitlers POW WOW!</title>

Different room, different day, different speaker, different slug, well everything...

Thanks for reporting this one. This is bizarre.

.... OH BUT WAIT: Yes, this is the enormous CCC merged schedule ("official" + self-organised talks?), so I imagine that's where this all went wrong.

Wilm0r commented 1 year ago

... TIL that github parses changelog messages and autocloses bugs when deemed appropriate. Neat, I guess. :)