Open GeoffreyKhan opened 3 years ago
Thanks for the clear report and link!
This is a tricky issue with how numbers sort inside longer alphanumeric strings. Technically these titles are presented in dictionary order (A1 comes before A4, even if it's actually "A10 (Kha..."). There are hacky things we can do to make them sort more logically, but nothing comprehensive: each would rely on a narrow understanding of how this example is structured, so it might bork on future, different title formats.
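To make the dictionary-order behaviour concrete, here is a minimal Python sketch. The sample titles are shortened, hypothetical versions of the ones on the page, and this is not the site's actual code:

```python
# Plain string sorting compares character by character, so "A10..." sorts
# before "A4..." because "1" < "4" at the second character.
titles = [
    "A4 Is there a man with no worries?",
    "A10 A Visit from Harun ar-Rashid",
    "A5 Women do things best",
]

dictionary_order = sorted(titles)
prefixes = [title.split()[0] for title in dictionary_order]
print(prefixes)
```

The number inside the string is never read as a number, which is why A10 lands ahead of A4 and A5.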
Perhaps the best solution is to take this metadata out of the title itself. You already have most of this set on the text itself (albeit not yet visible until #65 is approved). Could we remove this from the title of the text and instead include it in this table as separate columns, eg:
name | author | text id | dialect | transcription | translation |
---|---|---|---|---|---|
Is there a man with no worries? | Khan 2016 | A4 | Urmi, Christian | ✔ | ✔ |
Women do things best | Khan 2016 | A5 | Urmi, Christian | ✔ | ✔ |
... etc ... | | | | | |
A Visit from Harun ar-Rashid | Khan 2016 | A10 | Urmi, Christian | ✔ | ✔ |
This would make it much easier to sort the page by `dialect`, then logically by `text id`, then alphabetically by `title`. It would also make it possible to allow users to filter and sort the table for themselves in future when it gets bigger.
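As a rough sketch of that three-level ordering, assuming the metadata lives in separate fields and every text id is some letters followed by optional digits (the row dictionaries and the `row_key` helper are hypothetical, not existing code):

```python
import re

# Hypothetical rows mirroring the proposed table columns.
rows = [
    {"name": "A Visit from Harun ar-Rashid", "text_id": "A10", "dialect": "Urmi, Christian"},
    {"name": "Women do things best", "text_id": "A5", "dialect": "Urmi, Christian"},
    {"name": "Is there a man with no worries?", "text_id": "A4", "dialect": "Urmi, Christian"},
]


def row_key(row):
    # Split the text id into letters and digits so the numeric part is
    # compared as an integer rather than as text.
    match = re.fullmatch(r"([A-Za-z]+)(\d*)", row["text_id"])
    letters, digits = match.group(1), match.group(2)
    number = int(digits) if digits else -1
    # Tuples compare element by element: dialect, then id, then title.
    return (row["dialect"], letters, number, row["name"].lower())


sorted_rows = sorted(rows, key=row_key)
print([row["text_id"] for row in sorted_rows])
```

Because each level is its own element of the key, later user-driven sorting or filtering would only need a different key, not a different title format.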
Let me know if this is an acceptable solution, and whether you think `recording_date` would be a useful extra metadata field to have.
This looks fine, but please note:
Please could you put this on staging
I haven't yet written the code for this part. Once a changeset is pushed I will apply the `on staging` label and then you can test it. I'm not going to move any more work into staging until the existing lot is all marked `ready for production`, else we'll never actually get anything deployed!
The sort ordering on the Text ID field is not trivial. I may have to write a specific sorter for this field, and before I do I want to check I have the spec down:
A > A1 > A2 > A10 > A999 > AA > B > [not set]
Are there any cases where the text ID is not of the form [someletters][somenumbers] or just [someletters]?
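One possible shape for such a sorter, as a sketch only: it assumes ids are `[someletters][somenumbers]` or just `[someletters]`, and it places any id in another format after the well-formed ones but before unset ones, which is a guess pending the answer to the question above. The name `text_id_sort_key` is hypothetical.

```python
import re

_TEXT_ID = re.compile(r"([A-Za-z]+)(\d*)")


def text_id_sort_key(text_id):
    """Build a key giving A, A1, A2, A10, A999, AA, B, then [not set]."""
    if not text_id:
        # Unset ids (None or empty string) sort after everything else.
        return (2, "", -1)
    match = _TEXT_ID.fullmatch(text_id.strip())
    if match is None:
        # Assumption: unexpected formats sort after well-formed ids.
        return (1, text_id, -1)
    letters, digits = match.groups()
    # A bare "A" gets -1 so that it sorts before "A1".
    number = int(digits) if digits else -1
    return (0, letters.upper(), number)


ids = ["B", None, "A10", "AA", "A2", "A", "A999", "A1"]
ordered = sorted(ids, key=text_id_sort_key)
print(ordered)
```

The key compares the letter part as text and the number part as an integer, so it does not depend on how many digits an id has.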
Most texts now simply have a title in words, such as 'Bread and Cheese'. The A1, A2, etc. at the start of some titles is the numbering under which these texts appear when published. There will be B1, B2 as well. The ones without A1, A2, etc. are, in principle, unpublished. If it would help, we could add the same kind of numbering before the title of all texts. What do you prefer?
(Revisiting this as part of current milestone)
I think that the "A2" or "B1" code should not be part of the title string, and should instead be saved in the new `text_id` field we previously created for this purpose. If we want to present them joined up we can output one field then the other: "{text_id} {title}".
Similarly, I think that strings like "(Khan 2016)" should not appear in the title, as there's a `source` field in which this reference is more commonly specified.
Finally, and just in case there's an easy win here, would it be acceptable to use `text_id`s in the form "A01", "A02", ..., "B23"? This is commonly how systems avoid the ambiguous sort-order issues of inconsistent-length numbers. (If more than 99 "A" texts are likely to eventually exist we should pad to three digits, eg "A001".)
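A small sketch of why padding works, assuming ids are letters followed by digits; `pad_text_id` is a hypothetical helper for illustration, not part of the codebase:

```python
import re


def pad_text_id(text_id, width=2):
    """Zero-pad the numeric part of an id, e.g. "A1" becomes "A01"."""
    match = re.fullmatch(r"([A-Za-z]+)(\d+)", text_id)
    if match is None:
        # Ids with no numeric part (or another format) are left untouched.
        return text_id
    letters, digits = match.groups()
    return f"{letters}{int(digits):0{width}d}"


# Once every number has the same length, plain string sorting is enough.
padded = sorted(pad_text_id(t) for t in ["A10", "A9", "B23", "A1"])
print(padded)
```

Passing `width=3` gives the "A001" form. The trade-off is that the width has to be chosen up front, whereas a dedicated sorter copes with any number of digits.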
Very good proposals. I've implemented these changes in the Urmi, Christian texts and they look fine and are ordering correctly.
Great, that looks better.
As an extra tweak, how about we lose the `source` column from the table and show the source as a hover element when one is available? Because the source text varies in length from nothing to the huge "Khan, Geoffrey. The Neo-Aramaic Dialect of the Assyrian Christians of Urmi. 4 vols. Studies in Semitic Languages and Linguistics 86. Leiden-Boston: Brill, 2016, vol. 4", it kind of distorts the table. For variable-length/optional fields like this, particularly if you don't expect users to be scanning the page to find them, hiding it in the hover text of an ⓘ symbol beside the text title might be neater.
OK, let's try that tweak.
In the list of uploaded audio files at https://nena.ames.cam.ac.uk/audio/, A10 is ordered before A9 in the Urmi, Christian dialect, thus:
A10 A9 A8 etc
Can you make A10, A11, etc. come after A9?