TabbycatDebate / tabbycat

Debating tournament tabulation software for British Parliamentary and a variety of two-team parliamentary formats
https://tabbycat.readthedocs.io/
GNU Affero General Public License v3.0

Make motion->round a many-to-many relation #645

Closed tienne-B closed 3 years ago

tienne-B commented 6 years ago

Some tournaments re-use motions for many rounds, such as to let teams debate both sides of a prepared topic. Due to the one-to-many relation between rounds and motions, the motion statistics do not take advantage of the complete set of data available, splitting it between various rounds. As a model, I propose:

Motion        motion_rounds       round
id            id (as required)    no change
text          motion_id
reference     round_id
info_slide    seq
flagged

Thoughts?
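
A minimal sketch of the proposed schema, using Python's built-in sqlite3 module rather than the project's actual Django models. The table and column names follow the table above; the column types, the uniqueness constraint, and the sample rows are assumptions made only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE round (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE motion (
        id INTEGER PRIMARY KEY,
        text TEXT NOT NULL,
        reference TEXT NOT NULL,
        info_slide TEXT NOT NULL DEFAULT '',
        flagged INTEGER NOT NULL DEFAULT 0
    );
    -- Intermediate table: one row per (motion, round) pairing.
    -- The UNIQUE constraint is an assumption, not part of the proposal.
    CREATE TABLE motion_rounds (
        id INTEGER PRIMARY KEY,
        motion_id INTEGER NOT NULL REFERENCES motion(id),
        round_id INTEGER NOT NULL REFERENCES round(id),
        seq INTEGER NOT NULL,
        UNIQUE (motion_id, round_id)
    );
""")

# Hypothetical sample data: one motion re-used in two consecutive rounds.
conn.executemany("INSERT INTO round (id, name) VALUES (?, ?)",
                 [(1, "Round 1"), (2, "Round 2")])
conn.execute("INSERT INTO motion (id, text, reference) VALUES (?, ?, ?)",
             (1, "Example motion text", "Example"))
conn.executemany(
    "INSERT INTO motion_rounds (motion_id, round_id, seq) VALUES (?, ?, ?)",
    [(1, 1, 1), (1, 2, 1)],
)

# All rounds in which motion 1 is used, found through the intermediate table.
rounds_for_motion = [row[0] for row in conn.execute(
    "SELECT r.name FROM round r "
    "JOIN motion_rounds mr ON mr.round_id = r.id "
    "WHERE mr.motion_id = ? ORDER BY r.id", (1,))]
print(rounds_for_motion)
```

The motion text is stored once and linked to both rounds, so statistics can be gathered per motion without matching duplicated rows.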

czlee commented 6 years ago

Provided there's a demonstrable need for this, I'm fine with it in principle (unless @philipbelesky objects). It'll be a hefty refactor, but that's okay. Django has a ManyToManyField for this purpose that might be useful, though to include division and seq we'll need a custom intermediate model. You might notice that for DebateTeam and DebateAdjudicator we eschew this approach completely and just define our own table and do all the relations ourselves—I'm fine with this too, and it's not obvious to me whether this is preferable.

The main stipulation I have is that this shouldn't complicate any of the workflow for existing use-cases at all, not even by one extra click. At the moment, we already support one motion or multiple motions using the (poorly named) "enable motions" option, and there's some minor gymnastics in setting up the UI to handle both cleanly and making sure it doesn't crash if you switch from one to the other mid-tournament. So this might be a little more complicated to do cleanly than first appears. I'd want to hash out how we expect the UI to work with this before we start.

tienne-B commented 6 years ago

motion_rounds would be the intermediate model as it contains the sequence. I did not include divisions as they are already dealt with by a many-to-many relation. (That was a typo...) In adding motions for debate, there could be a list of previous topics that can be selected from, on top of inserting new ones. However, the main use-case is consecutive rounds, where a button to re-use the ones from the last round (like for venues, adjs, and teams) would in my view improve the workflow, reducing the number of clicks, not increasing :)

But this is getting complicated I know...

czlee commented 6 years ago

Can you tell me a bit more about the use case? Does this mean teams debate the same topic many times, like in the US Lincoln-Douglas and policy formats? Is it always exactly twice? Is the fact that these motions will be repeated known ahead of time?

The current system is designed on the assumption that tab directors discover the motion at the same time everyone else does—15 minutes (BP) or 30 minutes (Australs) before the debate starts. (This is why, for example, entering the motion is accessed via the "display" page.) For fully-prepared tournaments, all motions can be entered ahead of time, but there's no "nice" UI for this (you'd just use importtournament or enter them into the database). So I'm wondering if the use case you have in mind breaks more than just one design assumption we currently have?

tienne-B commented 6 years ago

Yes, this means that teams debate the same topic many times, twice consecutively, taking the other side from the previous. It is known before-hand whether topics will be repeated.

The current assumption that the tabulation team discovers the motion at that time is not necessarily wrong, and should not be affected by this refactoring. The current problem with the one-to-many relation is that the same motion has to be entered again, which takes more time (and is bad for database normalization). The one-to-many relation also affects motion statistics, as it does not treat these motions as identical (and thus does not use all the available data).

We can still treat the motion->tournament relation as one-to-many, and it could be possible to use the reference codes to add existing motions to the round (in the edit motions formset, if an already-existing reference is entered, assume it's from a prior round and autofill the motion/infoslide).

philipbelesky commented 6 years ago

@czlee if you want to enter in motions ahead of time can't you add/release them through the normal UI once rounds are input?

I would also emphasise that this option probably shouldn't really complicate the standard motion/editing workflow and its assumptions. It seems like it would be perfectly viable as an option that requires users to set up motions as such in the backend?

It is a relatively heavy refactor just to enable motion statistics and remove some data entry hassles. As such I'd be very keen that it comes with a pretty comprehensive set of tests to help avoid regressions.

tienne-B commented 6 years ago

Could the enable_motions option be deprecated/removed, by instead looking at the number of motions in the round and not requiring motion input if there's just one? The statistics page could then be by motion, with the applicable rounds attached to the motion (rather than the inverse).

philipbelesky commented 6 years ago

Hmm, so conceptually enable_motions is implicitly tied to formats where there is motion selection, which is slightly different to just having multiple motions available, although I can't recall any circumstances off the top of my head where they don't coincide. Having it there does mean we can tailor the UI to the format (i.e. only asking for a single motion in the first place), although I'm not opposed to removing that or figuring out some other way of adapting the UI. @czlee can you think of a circumstance (or think it's worth preserving the option) where you would want to show multiple motions but not have the ballot ask to record them?

tienne-B commented 6 years ago

As future tasks once merged:

philipbelesky commented 6 years ago

Is there a need to merge before those future tasks? For sequence in particular I don't think we can release this change without that functionality so I'd prefer it land all at once.

czlee commented 6 years ago

> @czlee can you think of a circumstance (or think its worth preserving the option) where you would want to show multiple motions but not have the ballot ask to record them?

I'm not aware of any tournaments that would want something like this. I can see why a tournament might want to show motions but not collect information on them, but I don't feel strongly about supporting this.

That said, we also use the enable_motions option to determine whether to show one or three motions in the "edit motions" view, and whether to say "Edit Motion" or "Edit Motions". Also, I'm a little uncomfortable about inferring it from the number of motions, because it would forgo a check against human error: if (say) in Australs, the tab director forgets to enter the motion or by accident only enters one, I want something to look awkwardly wrong on the results form. If we're inferring intent from number of motions, it'll be too easy not to notice the mistake. I feel like the right place to specify such an element of tournament format is in the tournament configuration.

@tienne-B: I'd still like to learn about the use case for this feature in somewhat more detail. Would you please describe the tournament(s) you're familiar with that require this? I mean how the tournament works, when motions are revealed, why they do it, and I mean in debating terms, not software terms. Am I right in believing there's only one motion per round in this format? I got a bit confused about how it can be simultaneously true that the motion is released only shortly before the round starts, but is used for two rounds. Or if it's a fully prepared topic, how does the tab team not know what it is ahead of time? Is this a one-debate-per-week league or a four-debates-in-one-day tournament? I also wasn't sure whether you meant "many" (more than two) or "exactly two".

Sorry about the many questions: This is a major structural change (as @philipbelesky notes at #739), so I'd like to be sure it does exactly and only what it needs to do to be fit for purpose. It can be tempting to discuss this purely in terms of software requirements, and I appreciate the efforts to minimize the apparent impact on the software architecture, but an understanding of the debating context in which the software operates is important.

tienne-B commented 6 years ago

So point 2 (https://github.com/TabbycatDebate/tabbycat/issues/645#issuecomment-412312781) is a blocker. There is no need to merge before those tasks, but point 1 is not required for it. I'm just having difficulty with adding the sequence field as it is in another model. The solution is to make the formset point to RoundMotions, but I can't find a way to include the fields from Motion. The sequence is inferred by the position in the formset, so if only it were possible to re-order (drag/drop) the items in the set :)

Noted for enable_motions.

There are two types of tournament that use repeated motions. Both of them re-use motions in sets of 2 rounds, with the goal of having every team debate both sides of a topic. In the first type, this is only applicable to the first two rounds, which are prepared. In the second type, all rounds are non-prepared, and the whole tournament re-uses motions in pairs of rounds. In break rounds, the teams choose which of the existing motions to debate. In the first type, it is not required to "reveal" motions, but in the second it is necessary to reveal motions for odd-numbered rounds. So these are more in the "four-debates-in-one-day" category. So by "many," it is two in the first use-case presented, and it is more than two in the second.

No problem about the many questions. Expected it. The two cases I presented do show that a ManyToMany field is more appropriate for the relationship between Rounds and Motions, but problems do arise when refactoring after years of work on another relation. I hope this demonstrates the debating context. Open to your 3 other areas of enquiry now :)

czlee commented 6 years ago

I see. So in the non-prepared tournament, every motion is used exactly twice—motion A for rounds 1 and 2, motion B for rounds 3 and 4, which means that a motion is "revealed" for limited-time prep in odd-numbered rounds, but in even-numbered rounds teams basically know (and know that they know) the motion ahead of time?

My "more than two" question might have lacked context—what I meant was, are there ever motions (in the tournaments presently considered) that are used for more than two rounds, and are they ever used in two rounds that are not consecutive?

(Example of context: US high school leagues run a single resolution for two months or an entire academic year. It might seem at first like this qualifies as "many rounds to a motion", but actually it just means that tab doesn't care about the resolution, not even for information's sake, because it's well known by all ahead of time and the correct source is that link above. So in this case the solution is just not to handle motions. This example is hypothetical, because we don't really have any interest in supporting these formats, at least not currently, but hopefully it gives some idea of my line of thought. To avoid doubt, it's not an analogy or comparison.)

[I'll get to the sequence and database schema stuff once I think I understand the debating side 🙂]

tienne-B commented 6 years ago

I'd get bored debating the same topic for months on end :)

And yes, in my previous example, break rounds also re-use the motions from all preliminary rounds, and so the motion re-use is not always consecutive.

czlee commented 6 years ago

I see, but it's team choice? So in a six-round tournament breaking to quarters, motion A is released just before round 1 and everyone does it for rounds 1 and 2, motion B is released just before round 3 and everyone does it for rounds 3 and 4, motion C just before round 5 and everyone does it for rounds 5 and 6, and then for the quarter-final (and every subsequent break round) no motion is released and teams choose among motions A, B and C?

tienne-B commented 6 years ago

Yes (except that the prepared motion, if there is one, may not be chosen for break rounds). It would still be good to be able to associate the motion with many rounds (and the team-choice debates) though :)

czlee commented 6 years ago

So for this to work you would have to link every motion to every break round, whether or not it gets chosen, right? Are you trying to keep a record of who chose which motion? Are break rounds included in motion statistics?

tienne-B commented 6 years ago

Well, a subset of the motions. And yes, to keep a record of who chose what motion. Yes, break rounds are included in motion statistics (not in the BP format though)

czlee commented 6 years ago

At this point I'm wondering if it even makes sense to link break rounds to motions at all? Like, a motion is just an object, and you can always (in principle) link a motion to a debate, regardless of its round. The only reasons we attach motions to rounds are (1) in Australs and NZ Easters, to provide just those choices on the ballot form, so you don't have to scroll through irrelevant motions; and (2) to group them neatly on the public motions and motion statistics pages. And (1) is important only because, of the 24 preliminary round motions in the tournament, only 3 are applicable that round. I'm less convinced about the importance of restricting it to, say, 3 out of 4 motions in the tournament (if there's 1 prepared motion and 3 non-prepared motions).

Similarly, if during preliminary rounds, the structure is always "A for 1-2, B for 3-4", and so on, and non-consecutive uses start only in the elimination rounds, it seems more convenient just to have an option that will do that work automatically (i.e. for even rounds, automatically pull in the preceding odd round), rather than have to link motions in the database according to a pre-defined well-known consistent structure? Otherwise, if we make it more generic, you have to do all this special case handling with some motions having multiple rounds and some rounds having multiple motions (because it's a many-to-many field!), motions not all showing up in the same number of rounds, motions not being attached to debates when motion choice is disabled, and what order you show the motions in when the motion-round link structure is essentially random. It'd take a lot of work to keep that presentation intuitive in all combinations of cases. But in this case probably all you want is just a list of motions ordered by round, labeled "Rounds 1 and 2", "Rounds 3 and 4", etc., with combined statistics for the rounds relevant to each motion?
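
The "A for 1-2, B for 3-4" rule described above could be sketched as follows. This is only an illustration under the assumption that preliminary rounds are numbered from 1; the function names are hypothetical and not part of Tabbycat.

```python
def motion_source_round(round_seq: int) -> int:
    """Return the round whose motion is debated in round_seq.

    Odd rounds introduce a motion; each even round re-uses the motion
    of the odd round immediately before it.
    """
    if round_seq < 1:
        raise ValueError("round numbers start at 1")
    return round_seq if round_seq % 2 == 1 else round_seq - 1


def round_labels(num_prelim_rounds: int) -> list[str]:
    """Build one heading per motion, e.g. 'Rounds 1 and 2'."""
    groups: dict[int, list[int]] = {}
    for seq in range(1, num_prelim_rounds + 1):
        groups.setdefault(motion_source_round(seq), []).append(seq)
    labels = []
    for rounds in groups.values():
        if len(rounds) == 1:
            # An unpaired final round keeps its own heading.
            labels.append(f"Round {rounds[0]}")
        else:
            labels.append(f"Rounds {rounds[0]} and {rounds[1]}")
    return labels


print(round_labels(6))
```

Because the pairing is computed from the round number, no motion-to-round links need to be stored for this structure.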

tienne-B commented 6 years ago

I don't even think it makes sense to directly link motions to rounds to begin with! It is a convenience for easier input as you say. It should be made more generic to allow for any sort of motion<->round madness there is out there :) Thankfully, given the way ManyToManyFields work in Django, there is not much "special case" logic. And I don't take the link to be essentially random; the ordering can be changed, or what did you mean by that?

It seems to me that motion statistics are the main "bone" of contention. I like them because I'm a "data nerd," but I don't see their use for decision-making inside a tournament. Mind enlightening me on that? They seem to be for two things: how many teams chose which motion in the round, and then whether it is balanced (chi-squared test). Previously, the data used for both could be the same, but now the motion-round statistics need RoundMotions objects while the motion statistics (chi-squared) just need Motion objects. My patch disrupted motion-round analysis, and I'll revert that. What I'll do is just have the motion items duplicated between rounds (proportions of winning sides would be similar round to round), but use the complete set for chi-squared. Would that be better? (It should also make the code nicer :) )
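
To illustrate why pooling rounds matters for the balance test, here is a rough sketch of a chi-squared goodness-of-fit test against an even split of wins. It assumes a two-team format and is not necessarily how Tabbycat computes its statistic; the win counts are made up for the example.

```python
import math


def balance_chi_squared(gov_wins: int, opp_wins: int) -> tuple[float, float]:
    """Chi-squared test of wins against a 50/50 split (1 degree of freedom)."""
    total = gov_wins + opp_wins
    if total == 0:
        raise ValueError("no debates to test")
    expected = total / 2
    statistic = ((gov_wins - expected) ** 2 + (opp_wins - expected) ** 2) / expected
    # For 1 degree of freedom the upper tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical (gov wins, opp wins) for one motion debated in two rounds.
per_round = {1: (14, 6), 2: (13, 7)}

for round_seq, (gov, opp) in per_round.items():
    print(round_seq, balance_chi_squared(gov, opp))

# Pooling both rounds doubles the sample size for the same motion.
pooled_gov = sum(gov for gov, _ in per_round.values())
pooled_opp = sum(opp for _, opp in per_round.values())
pooled_stat, pooled_p = balance_chi_squared(pooled_gov, pooled_opp)
print("pooled", pooled_stat, pooled_p)
```

With these made-up counts, neither round alone falls below a 0.05 threshold, but the pooled data does, which is the kind of difference that treating the two rounds as separate motions would hide.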

philipbelesky commented 6 years ago

Just speaking to the point of the motion tab: its primary use is for after the tournament and to (nominally) aid in future decision-making (i.e. future motion setting) by displaying what the balance was and (if in a relevant format) how often it was chosen. I say nominally because it is dangerous to over-interpret these for many reasons but tbh it is interesting information to just show to the general public for curiosity's sake.

There probably isn't much use within-rounds aside from CAs just wanting to check how their motions 'performed'. There might be the rare case that CAs are perhaps planning a future motion with a similar subject matter or phrasing where they could be informative, but I doubt that happens very often.

czlee commented 6 years ago

So, I think you're right—motions should only be linked to rounds insofar as there's an actual purpose for it. In our case, there is: the convenience of not having to scroll through 21 irrelevant motions (at Australs) is important. Also, in most parliamentary styles (including BP, Asians and all Oceanian styles), people think of motions as being "for a round", so it helps to present it as such when presenting motions.

To see what I meant by special case logic, consider the following motion-round structure (this is what I meant by "random", but "random" was a poor choice of word, I just meant "not following an obvious structure"):

Motion    Rounds
A         1
B         2
C         1, 3
D         1, 2
E         3

How do you display the motions, on any page that lists motions (motion statistics, public motions list)? Do you display by round? If so, do you display motions that were in both rounds twice, or only under the first appearance? You can resolve all this however you want, of course, but some obvious answers get counterintuitive with simpler cases.

For example, you might want to display just a list of motions, not separated by round, ordered however, with little badges (like the ones you have now) indicating which rounds they were used in (for all of them). But in tournaments with three motions every round, the lack of visual separation between rounds makes the page harder to parse.

Or you might want to always enforce visual separation between rounds, with motions appearing only under their first round. But then, would it be confusing that motions A, C and D are "grouped together"? (We already hint at this in https://github.com/TabbycatDebate/tabbycat/pull/739#issuecomment-412321273.)

Or, maybe you want to enforce visual separation between rounds, with motions appearing under every round in which they're debated. But then say you have a tournament structure: A/1-2-3-4, B/5-6-7-8. Or even, A/1-2, B/3-4, C/5-6, D/7-8. Now every motion is duplicated twice, or four times, consecutively! This seems silly, intuitively they should be just laid out once each, right?

Or, you might want to do what I think is most intuitive in the A/1-2, B/3-4 structure: List the first motion, marked with the header "Round 1 and Round 2", and then the second, "Round 3 and Round 4", and so on. But this isn't compatible with Australs-style motion structures, let alone that crazy case I first put above.

The best answer to all this depends on the tournament. We could try to do a one-size-fits-all thing, but only by compromising on usability for everyone, either slightly or maybe even significantly.

Now, you might think, "But who cares? No tournament has such a whack motion-round structure." And I agree—that's my point. But the software isn't allowed to crash, and if someone does do it, you don't want it to misbehave, either. So if it's not a structure you anticipate actually dealing with, you shouldn't allow it in the first place, or make sure it'll do something sensible if someone does do it.

Ordinarily, I'm a big fan of making code generic, and I admire your tendency to do the same. But this comes with trade-offs, and in this case I worry that it might be an unnecessary Pandora's box. I don't even know how to conceptualize arbitrary motion-round connections properly—obviously you can just define it (as we have), but I don't know how it's supposed to work. If there's a known tournament structure that we want to cater for, we should have a system that caters for that structure. If there does happen to be a tournament that would run something like the above table, I'd be happy to consider how to work with it—after learning more about their tournament, in debating terms.

tienne-B commented 6 years ago

I would go with separation between rounds, but with the duplication of the motions. That is what I was getting at with the distinction between motion-round statistics (how many debates used it, and the split) and pure motion statistics (is it balanced?). That is the only way I can see that preserves both classes of statistics. (I would put a badge saying it is duplicated though). The problem was confounding both types of statistics; we want to know both how motions did in specific rounds and generally, but make it clear which type we're talking about where.

That is also the format that would minimize differences between the old model and the new (and it does minimize the changes required).

I hope I'm not just succumbing to the sunk cost fallacy; I think this would be a useful change though. (Also, the first time I tried out Tabbycat, non-repeatable motions were my first chagrin :) )

czlee commented 6 years ago

No, no, you're missing the point. Your preference is informed by the needs of your tournament. I'm saying that the presentation you're interested in depends on what the priorities of your competition are.

In terms of technical feasibility, you can draw the distinction between motion-per-round statistics and motion-overall statistics using any data structure that knows which motions affiliate with which rounds. So if you just had a(n optional) rule that said that every even preliminary round uses the same motion as the preceding preliminary odd round, you technically have enough information to compile all the statistics you want. Alternatively, you could allow any motion to be associated with any round, which is the approach I believe you're proposing.

You might think the latter is easier to implement. It's certainly more generic. But I'm saying that it also drastically increases the space of possible motion-to-round relationship structures that we have to cover, and that a one-size-fits-all solution is unlikely to fly, because different tournaments are interested in different ways of doing motions.

I don't know what the best solution is, at least not currently. But to persuade me that a system with arbitrary round-to-motion relationships is it, someone would have to persuade me either that (1) we can design a system that is intuitive for any round-to-motion relationship structure I can dream of, and that reduces to straightforward presentation with minimal special case code in cases where the true relationship is one-to-many, many-to-one or one-to-one; or that (2) there is a tournament that needs arbitrary round-to-motion relationships, not a well-defined, understandable structure in the rules of the tournament. Absent that, I'm currently leaning towards implementing a setting specifically for the A/1-2 B/3-4 C/5-6 structure that you describe.

(When you say duplicates, do you mean literally present the exact same numbers twice, representing overall statistics for the motion, just because the motion was used twice? I presume not, right?)

tienne-B commented 6 years ago

For point (1), I'm trying to convince you that having repeated motions (not headers) in many rounds is intuitive, not just because that is the status quo right now, but because the main focus is on how many teams chose what and the split by round, which may change between rounds (say, how it changes after everyone has debated it already). For (2), this allows the re-use of motions between tournaments (hosted on the same site), which may also be wanted, and that could only be done through arbitrary round-motion relationships.

You do make a good point about having settings, but I think having a neater structure for the settings to work with (this) would be preferable. Settings can make this new structure, which is mostly hidden, more accessible.

No, I am not saying to present the same information twice; the number of people who chose a motion can change between rounds, as can the split.

czlee commented 6 years ago

If by (1) you mean the status quo statistics, it's not really clear to me what we gain. If the only real change is that tab directors needn't copy the motion manually from the last round, the simpler solution is to provide a button that copies the motion from the last round.

Is (2) something that is actually presently needed? If so, can you describe the needs, motivations, structure, etc. of the organization that uses it, in debating and logistics terms?

tienne-B commented 6 years ago

I think there are things to gain in terms of statistics with this change. I think there is a place for having both the motion-by-round and pure motion statistics; I just haven't written much of it yet, only the chi^2 test part. I have shown that this change need not disrupt the normal workflow, but that it opens up new possibilities. Also, I don't like the idea of just copying a motion over, as then we're repeating ourselves and the link between the same motion is broken. Normalization is possible here. A button could be implemented that inserts the relation without duplicating a record. This is a neater base for that!

But you are right that there are many possible ways to view the motion statistics, by round or by motion, and both are valid, but why not both? Both would paint a somewhat different view of the motion's "performance".

(2) Not that I know of. That's why I prefer discussion on (1) :)

czlee commented 6 years ago

Would you lay out in this thread what you intend to implement, then, please? Not how it works, but just what features and workflow you envision.

For example: Would we then have two motion statistics views, one aggregated by motion, and one with each round being treated separately? Do both these views continue to exist when there's a one-to-one relationship between round and motion? What about many-to-one, and one-to-many?

I'd also appreciate it if you could lay out the example I had in that table above.

tienne-B commented 6 years ago

The workflow is that when this new option is activated, the "copy" button is available on the draw page, and the RoundMotions formset is available as a link from the Motions formset (so that inputting new motions is quick). The motion statistics will be by motion, and not by round. There will always be only 1 statistics view.

The new arrangement would only be active if the new setting is enabled, and, taking your table from above, the arrangement would be: (Sequence of motions may differ in both tables)

Motion    Rounds
A         1
B         2
C         Overall, 1, 3
D         Overall, 1, 2
E         3

While if it's disabled, it would keep its current view:

Round    Motions
1        A, C, D
2        B, D
3        C, E

However, in the round view, the chi-squared statistic will be repeated by motion as it should output the same conclusion regardless of round, but it would have a greater sample size like this.

Sound good? Making me plan ahead :)
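
The two arrangements above can be derived from the same motion-to-round links. The sketch below is illustrative only: the function names are hypothetical, and the links are the example structure from earlier in this thread.

```python
from collections import defaultdict

# Motion-to-round links from the example table in this thread.
motion_rounds = {"A": [1], "B": [2], "C": [1, 3], "D": [1, 2], "E": [3]}


def motions_by_round(links: dict[str, list[int]]) -> dict[int, list[str]]:
    """Invert motion -> rounds into round -> motions (the by-round view)."""
    grouped: dict[int, list[str]] = defaultdict(list)
    for motion, rounds in links.items():
        for round_seq in rounds:
            grouped[round_seq].append(motion)
    return {r: sorted(grouped[r]) for r in sorted(grouped)}


def sections_by_motion(links: dict[str, list[int]]) -> dict[str, list[str]]:
    """List the statistics sections per motion (the by-motion view).

    A motion used in more than one round gets an extra 'Overall' section.
    """
    sections = {}
    for motion, rounds in sorted(links.items()):
        labels = [f"Round {r}" for r in sorted(rounds)]
        if len(rounds) > 1:
            labels.insert(0, "Overall")
        sections[motion] = labels
    return sections


by_round = motions_by_round(motion_rounds)
by_motion = sections_by_motion(motion_rounds)
print(by_round)
print(by_motion)
```

Only motions C and D get an "Overall" section here, since they are the only ones linked to more than one round.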

czlee commented 6 years ago

To clarify: In the first table above (first by motion, then by round), are statistics also aggregated (as opposed to merely sorted) by motion, so that it shows just one set of statistics for all debates using that motion? That is:

By motion:  Motion A  (all rounds where used)
            Motion B  (all rounds where used)
            Motion C  (all rounds where used)
            Motion D  (all rounds where used)
            Motion E  (all rounds where used)

By round:   Round 1
                Motion A
                Motion C (just those debates in round 1)
                Motion D (just those debates in round 1)
            Round 2
                Motion B
                Motion D (just those debates in round 2)
            Round 3
                Motion C (just those debates in round 3)
                Motion E

tienne-B commented 6 years ago

Mostly correct. However, the by-motion view also shows motion-by-round stats, so motions C and D have many sections (overall and their rounds). By round, you are correct (however, the chi-squared test will use the complete set of rounds, and is repeated with all instances of the motion).

czlee commented 3 years ago

celebration!