Support alternative work titles

Nadeflore commented 7 years ago

Sometimes works are known by different names. Being able to search for a work by an alternative name would be a nice improvement.

Benstales commented 6 years ago

I'll work on this issue once I have more free time, in about two to three weeks. It's interesting, since people often know the title of a show in their native language but not its original title (if it's, ofc, a foreign show). So it's very helpful when you are not familiar with the database of the karaoke host.

Benstales commented 6 years ago

Hi there, I have started working on this. I was wondering whether it is still relevant to distinguish between titles and alternative titles. "Star Wars" is not much more official than its old French title "La Guerre des Etoiles", for example.

Just to clarify, I added another model, "WorkTitle", containing a single field, "title". There is a ManyToMany relationship between Work and WorkTitle, since a work can have different titles and a title can refer to several works ("Star Wars" refers to Star Wars I, Star Wars II and Star Wars III). In this case, I am wondering whether it would be more relevant to replace the "title" field with a "name" field, used mainly for display, and to add a "titles" field pointing to WorkTitle, used for queries.

Question time: I would like to know if there is a rule for deciding which model owns the ManyToMany. Atm I give the ownership to Work, but WorkTitle is not a bad choice either.

Neraste commented 6 years ago

Good question. I think it is not that bad to have one 'main' title and (if needed) several alternative titles. Keep in mind that if there were no distinction in the back-end, there should be one at least in the front-end (e.g.: which title to display first, if not which one to display at all in compact view). Having a clear main name is handy for manipulation. We tried the alternative-only approach in a former version of Dakara, using a Boolean field to indicate which title is the main one, and it was a true hell.

I think a simple OneToMany is enough for our needs. A ManyToMany is more complex (it needs one extra table to store the relations) and offers little more in comparison (we will never list works for a given alternative title). When querying a work with an alternative title (which is the intended use of this field), titles and alternative titles will be matched indifferently. There will be some redundancy between the alternative titles, but I do not think this is critical for a typical use.

I think the new model's name should be a bit more specialized: WorkAlternativeTitle. Work's title field should remain, the new field could be alternative_titles.
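
The structure proposed here can be sketched as follows. This is only an illustration using plain Python dataclasses, not Django models, so that it runs standalone; the `all_titles` helper is hypothetical and not part of the project. In Django, the same shape would presumably be a foreign key from WorkAlternativeTitle to Work.

```python
from dataclasses import dataclass, field


@dataclass
class WorkAlternativeTitle:
    """One alternative title, owned by exactly one work (one-to-many)."""

    title: str


@dataclass
class Work:
    """A work with one main title and any number of alternative titles."""

    title: str  # main title, the one used for display
    alternative_titles: list[WorkAlternativeTitle] = field(default_factory=list)

    def all_titles(self) -> list[str]:
        """Return the main title followed by the alternative titles.

        Hypothetical helper, added for illustration only.
        """
        return [self.title] + [alt.title for alt in self.alternative_titles]


star_wars = Work(
    title="Star Wars",
    alternative_titles=[WorkAlternativeTitle("La Guerre des Etoiles")],
)
```

The main title stays a plain string on the work, so the front-end always knows which title to display, while the alternative titles only exist to widen the search.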

Benstales commented 6 years ago

Thanks for answering, I will work with that in mind. Since it is a OneToMany relationship it will still be coherent to dissociate the main title (a string) from the alternative titles (WorkAlternativeTitle), so I agree on the idea.

Since I have more free time, I am going to really put myself to work.

Benstales commented 6 years ago

I am not sure what you meant by matching titles and alternative titles indifferently. What I am about to do is edit the work query so that it now includes the alternative titles (in other words, edit views.py, since there is nothing to add to the query language). Ofc I will also edit the remaining query (views.py).
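
A minimal sketch of such a query, written against an in-memory SQLite database rather than the actual Django views, so it can be run on its own. The table names, column names and the `search_works` function are assumptions made for the example. A work matches if either its main title or any of its alternative titles contains the search text.

```python
import sqlite3

# Hypothetical schema: one row per work, one row per alternative title.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE work (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE work_alternative_title (
        id INTEGER PRIMARY KEY,
        work_id INTEGER NOT NULL REFERENCES work(id),
        title TEXT NOT NULL
    );
    INSERT INTO work (id, title) VALUES (1, 'Star Wars'), (2, 'Alien');
    INSERT INTO work_alternative_title (work_id, title)
        VALUES (1, 'La Guerre des Etoiles'), (1, 'Star Wars: A New Hope');
    """
)


def search_works(query: str) -> list[str]:
    """Return main titles of works whose title or alternative title contains query."""
    pattern = f"%{query}%"
    rows = conn.execute(
        """
        SELECT DISTINCT work.title
        FROM work
        LEFT JOIN work_alternative_title AS alt ON alt.work_id = work.id
        WHERE work.title LIKE ? OR alt.title LIKE ?
        ORDER BY work.title
        """,
        (pattern, pattern),
    ).fetchall()
    return [row[0] for row in rows]
```

The DISTINCT matters: a work with several matching alternative titles would otherwise be returned once per matching row of the join.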

Benstales commented 6 years ago

That's quite troublesome. The queries are way too long and flake8 complains about them. Renaming to WorkAltTitle and alt_title would shorten those queries. Would this renaming be acceptable?

Benstales commented 6 years ago

A new branch has been created to adapt the feeder to take into account the alternative titles.

Benstales commented 6 years ago

I checked how feed.py works. If I understood correctly, it parses every file in a folder specified by the user, which should contain song media files and subtitle files. From the names of those files we can extract data on the songs (with a custom parser), then we create the songs and the works, artists, etc. associated with these songs if they do not exist already. So everything is fed from the data we can extract from a "Song".

For the alternative titles, we cannot proceed the same way: it would create redundancy, and we would have to update the file name of every single song associated with a work in order to include its alternative titles. We would have to create a separate file gathering all the data about work alternative titles.

One idea would be to provide a custom method, like custom_parser, to retrieve the data on the work alternative titles. This would leave the choice of the file format containing such data (CSV, XML, JSON or something else) to the user. The only parameters required by this custom method would be the path of the file containing all the data related to work alternative titles, and the name of the work whose alternative titles we want to retrieve.
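
To make the idea concrete, here is a rough sketch of what such a method could look like for a JSON file. The function name `get_work_alternative_titles` and the file layout (main title mapped to a list of alternative titles) are assumptions for the example, not something that exists in the feeder.

```python
import json
import tempfile
from pathlib import Path


def get_work_alternative_titles(file_path: str, work_title: str) -> list[str]:
    """Return the alternative titles of work_title read from a JSON file.

    Assumed (hypothetical) layout: {"<main title>": ["<alt title>", ...], ...}
    """
    with open(file_path, encoding="utf-8") as file:
        data = json.load(file)
    # A work that is absent from the file simply has no alternative titles.
    return data.get(work_title, [])


# Demo with a throwaway file.
with tempfile.TemporaryDirectory() as directory:
    path = Path(directory) / "alt_titles.json"
    path.write_text(
        json.dumps({"Star Wars": ["La Guerre des Etoiles"]}), encoding="utf-8"
    )
    found = get_work_alternative_titles(str(path), "Star Wars")
    missing = get_work_alternative_titles(str(path), "Alien")
```

A user storing the data in CSV or XML would only have to provide another function with the same two parameters.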

Neraste commented 6 years ago

Now that you mention it, it seems clear that the feeder is not suited to handling extra information about works. As you said, the aim of the feeder is to extract data from individual song files, so dealing with an extra works file is out of its scope. I think that we should instead create a new command dedicated to this task.

We could name it createworks and it would get or create works (based on their name?), then add extra info to them. The idea of a custom parser is appealing if the works are stored in an exotic way, but since a JSON dump seems pretty straightforward, it could be the default parser.
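
The get-or-create behaviour could look roughly like the sketch below. It uses a plain dictionary as a stand-in for the database, and the JSON layout (a list of entries with "title" and "alternative_titles" keys) is an assumption for the example, not a decided format.

```python
import json

# Stand-in for the database: main title -> set of alternative titles.
works: dict[str, set[str]] = {"Star Wars": set()}

# Hypothetical work file content.
WORK_FILE = json.loads(
    """
    [
        {"title": "Star Wars", "alternative_titles": ["La Guerre des Etoiles"]},
        {"title": "Alien", "alternative_titles": []}
    ]
    """
)


def create_works(entries: list[dict]) -> int:
    """Get or create each work by title, then attach its alternative titles.

    Returns the number of works that had to be created.
    """
    created = 0
    for entry in entries:
        if entry["title"] not in works:
            works[entry["title"]] = set()
            created += 1
        # Using a set makes the command safe to run several times.
        works[entry["title"]].update(entry.get("alternative_titles", []))
    return created


created_count = create_works(WORK_FILE)
```

Matching works on their title means that running the command twice with the same file changes nothing the second time.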

Benstales commented 6 years ago

That sounds fine to me. I would be glad to continue developing this feature.

There is one other part I want to tackle that is puzzling me (and I promise it's the last one on my list). As it is, we can only display the title stored in the "title" field of a work. But sometimes you would like to display another title. For example, if you're Japanese, you would like to display the Japanese title instead of the English one. Maybe we should consider it once everything is ok with the alternative titles, but I want to keep in mind that it could be a good feature if we want this software to be used by people all around the world. One way to solve this is to add a "title_class" field to the alternative title model and then choose which title to display based on this field. But, well, that is maybe for another enhancement.

Neraste commented 6 years ago

Indeed, for l10n, displaying the local title could be nice. Maybe a language attribute with an ISO 639-2 code can do the trick.

But for now, i18n and l10n have not been considered at all in the project. It is a challenging feature to implement and we will definitely do it, but it is not in our schedule. Moreover, we have seen during the development of Dakara that pre-implementing future features leads to unnecessarily complicated code. Let us focus on the core functionality of the alternative titles first; we will be able to improve it in the future, one feature at a time.

Neraste commented 6 years ago

I thought that there were some i18n and l10n notes, but I cannot find them. I have created a new project for this matter on the front-end side and I have added the idea of local work names.

Benstales commented 6 years ago

Hi again,

Given how the admin feeds the database with the "scan method" (by parsing the names of the files in a directory), it seems difficult to update the database with the alternative titles, because no work file is generated at the end of the scan (createworks takes a work file as a required argument).

What may be useful is a dumpworks command that produces, from a Dakara database, a JSON work file that can be used to recreate the works of this database.

Nadeflore commented 6 years ago

You mean that, in order to gather alternative titles for works already in the database, we need a way to export the list of works from the database?

This could work, but I think we could use another approach: instead of using a Django command, we could write a script which makes calls to the API to retrieve the list of works and update the works with alternative titles.

Neraste commented 6 years ago

I am not sure I understand this correctly. The first call of feed/future-scan does not add alternative titles, or does it? This command creates works on demand. So calling createalternativetitles afterwards will only update the works already created. If you call this command first, nothing is done. This is not a problem for me.

Also, I was told that Django can export/import the database at least in JSON by itself.

Benstales commented 6 years ago

@Nadeflore Yes, I might not have been clear enough. The idea is that adding the alternative titles with the feed/scan method is inappropriate, for the reasons mentioned above, so we need to feed the database with something I call a "work file", which gathers all the information about a work type, its related works and their data (subtitle, alternative titles). To add the alternative titles with the current feed command, I think the natural way is to:

1. execute the feed/scan command to create works, songs, ...
2. retrieve a work file such that, if given to the createworks command, it creates exactly the same works with the same data as the ones in the current database (or, put more simply, a work file that describes the works in the current database)
3. add the alternative titles for each work to this work file
4. execute the createworks command to update the database with the information added to the work file.
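
The last three steps above can be sketched as a round trip. This is only an illustration with a dictionary standing in for the database; the `dump_works` and `update_works` functions and the work file layout are hypothetical.

```python
import json

# Stand-in for the database after the feed/scan step:
# works exist but have no alternative titles yet.
database = {"Star Wars": [], "Alien": []}


def dump_works(db: dict[str, list[str]]) -> str:
    """Serialize the works currently in the database to a JSON work file."""
    entries = [
        {"title": title, "alternative_titles": alts}
        for title, alts in sorted(db.items())
    ]
    return json.dumps(entries, indent=2)


def update_works(db: dict[str, list[str]], work_file: str) -> None:
    """Merge the alternative titles of the work file back into the database."""
    for entry in json.loads(work_file):
        known = db.setdefault(entry["title"], [])
        for alt in entry["alternative_titles"]:
            if alt not in known:
                known.append(alt)


# Retrieve a work file describing the current database.
work_file = dump_works(database)

# Add the alternative titles to the work file (done by hand in practice).
entries = json.loads(work_file)
for entry in entries:
    if entry["title"] == "Star Wars":
        entry["alternative_titles"].append("La Guerre des Etoiles")
edited_work_file = json.dumps(entries)

# Update the database from the edited work file.
update_works(database, edited_work_file)
```

Dumping the database again right after the update and feeding that file back in leaves the database unchanged, which is the property the second step asks for.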

@Neraste I checked Django's dumpdata command and it is working fine. So yeah, there might be no need for a dumpworks command.

Neraste commented 3 years ago

Was implemented long ago.

DakaraProject / dakara-server

Support alternative work titles #31