trovu / trovu-data

Shortcuts & mapping data for types.
The Unlicense

Organize dictionary / clashing shortcuts into their own namespaces #10

Closed georgjaehnig closed 1 year ago

georgjaehnig commented 2 years ago

A current problem around dictionaries is that there are shortcuts for the same language pair but for different dictionaries, for instance

Distinguishing by one letter (d, p, b) has its limits, as multiple dictionaries can start with the same letter, obviously.

An idea to solve this:

That way, the rest of the shortcut data exists only once, which keeps it DRY.

Then, users may set their own aliases in their trovu-data-user repo.

(Potential) problems

neubarth commented 1 year ago

At the beginning, I was sceptical about using namespaces for specific sites like Pons or dict.cc, because I thought they should be reserved for global concepts, like languages and countries. But then I thought that ICANN's top-level domains have taken the same path recently, so I guess that's fine.

Also, I thought about what seems most intuitive to me from a user perspective. These are the results:

  1. de.en tree, pons
  2. pons tree, de-en

I assume both options are not viable, because the arguments could not be used as simple query parameters, i.e. some additional processing would need to be introduced.

Therefore, I think we should go with the namespace-approach, assuming that the number of namespaces can be scaled up easily.

parsing of aliases beyond namespaces could be challenging

Can you elaborate a little bit more on that? Thanks :)

calling shortcuts directly will be tedious: dict-cc.de-en tree instead of end tree

I think dict-cc.de-en tree is as concise and readable as it can get. And it should only be required when the default dictionary is not desired. When a user finds that he needs this structure often, he can either memorize the end tree version, or create a custom shortcut for himself.
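To make the shape of such a call concrete, here is a minimal sketch of how a query like dict-cc.de-en tree could be split into namespace, keyword and argument. The function name parse_query and the ParsedQuery type are made up for illustration and are not taken from the trovu code base.

```python
from typing import NamedTuple, Optional


class ParsedQuery(NamedTuple):
    namespace: Optional[str]  # e.g. "dict-cc", or None if no namespace was given
    keyword: str              # e.g. "de-en" or "end"
    argument: str             # e.g. "tree"


def parse_query(query: str) -> ParsedQuery:
    """Split 'namespace.keyword argument' into its parts (hypothetical helper)."""
    # Everything before the first space is the (possibly namespaced) keyword.
    head, _, argument = query.strip().partition(" ")
    # A dot in the head separates an explicit namespace from the keyword.
    namespace, dot, keyword = head.partition(".")
    if not dot:
        return ParsedQuery(None, head, argument.strip())
    return ParsedQuery(namespace, keyword, argument.strip())
```

With this sketch, parse_query("dict-cc.de-en tree") yields the namespace dict-cc, the keyword de-en and the argument tree, while parse_query("end tree") yields no namespace at all.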

m4h4 commented 1 year ago

The namespace idea for dictionary families sounds good to me (although I suggest using short strings, such as pons, dict, beo). There should be a default dictionary in the language namespace, so if I don't care which Finnish dictionary I get, I can just type fi and will get Finnish to German in the German namespace. If I want pons explicitly, I can type pons.de-fi, and if I always prefer this, I can define my own name.
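The lookup described here could be sketched roughly as follows. This is only an illustration under assumptions: the SHORTCUTS table, the placeholder URLs and the function resolve are invented for the example and do not reflect the actual trovu data or code.

```python
from typing import Dict, List, Optional

# Hypothetical data: namespace -> keyword -> URL template (placeholder URLs).
SHORTCUTS: Dict[str, Dict[str, str]] = {
    "de": {"fi": "https://default-dictionary.example/de-fi?q={query}"},
    "pons": {"de-fi": "https://pons.example/de-fi?q={query}"},
}

# Namespaces the user has enabled, in order of priority (assumed setting).
USER_NAMESPACES: List[str] = ["de"]


def resolve(keyword: str,
            user_namespaces: List[str] = USER_NAMESPACES,
            shortcuts: Dict[str, Dict[str, str]] = SHORTCUTS) -> Optional[str]:
    """Return the URL template for a keyword, or None if nothing matches."""
    namespace, dot, name = keyword.partition(".")
    if dot:
        # Explicit form such as "pons.de-fi": look only in that namespace.
        return shortcuts.get(namespace, {}).get(name)
    # Short form such as "fi": take the first hit among the user's namespaces,
    # which acts as the default dictionary for that language.
    for ns in user_namespaces:
        url = shortcuts.get(ns, {}).get(keyword)
        if url is not None:
            return url
    return None
```

In this sketch, fi resolves to whatever the de namespace defines as its default, and pons.de-fi bypasses the default entirely.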

georgjaehnig commented 1 year ago

@neubarth:

parsing of aliases beyond namespaces could be challenging

Can you elaborate a little bit more on that? Thanks :)

Sure. It's about this ticket. In fact, it was renamed in the meantime from alias: to include. With the current concept, such things will be possible:

So there's one spot with the dictionary URL and possibly tags & examples, and this gets included from the language namespaces de and es. Because we include only some parts, we can set custom titles in each of the language namespaces.

And as I expected: Writing the code to include across namespaces is more challenging than I thought. :) But still possible, and it causes a lot of good cleanup.
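A rough sketch of what including across namespaces could look like, assuming a simplified data layout: the field names (include, namespace, key), the keyword es-dict, the sample entries and the function resolve_include are all hypothetical and only serve to illustrate the idea, not the real trovu format.

```python
from typing import Any, Dict, Optional, Set, Tuple

# Hypothetical data: the dictionary namespace holds the URL and tags once,
# the language namespace only sets its own title and points to it.
DATA: Dict[str, Dict[str, Dict[str, Any]]] = {
    "dict-cc": {
        "de-es": {
            "url": "https://dict-cc.example/de-es?q={query}",
            "title": "German-Spanish (dict.cc)",
            "tags": ["dictionary"],
        },
    },
    "de": {
        "es-dict": {
            "title": "Spanisch (dict.cc)",
            "include": {"namespace": "dict-cc", "key": "de-es"},
        },
    },
}


def resolve_include(namespace: str, keyword: str,
                    data: Dict[str, Dict[str, Dict[str, Any]]],
                    seen: Optional[Set[Tuple[str, str]]] = None) -> Dict[str, Any]:
    """Merge an entry with the entry it includes; own fields win."""
    if seen is None:
        seen = set()
    if (namespace, keyword) in seen:
        raise ValueError(f"include loop at {namespace}.{keyword}")
    seen.add((namespace, keyword))

    entry = data[namespace][keyword]
    include = entry.get("include")
    if include is None:
        return dict(entry)
    # Resolve the included entry first, then overlay the including entry's fields.
    base = resolve_include(include["namespace"], include["key"], data, seen)
    own = {k: v for k, v in entry.items() if k != "include"}
    return {**base, **own}
```

Here resolve_include("de", "es-dict", DATA) returns the URL and tags from the dict-cc entry together with the title set in the de namespace. Tracking already visited entries is one way to guard against includes that point back at each other.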

I think dict-cc.de-en tree is as concise and readable as it can get

Well, more concise would be dict-cc.en tree from the de namespace (and, respectively, dict-cc.de tree from the en namespace). But I just realized that this could be achieved by including them:

he can either memorize the end tree version, or create a custom shortcut for himself.

I think I want to drop keywords like end, as they are not intuitive (see also enl for Leo and then eni for Linguee, as the l was already in use) and clutter the namespace.

The longer versions à la en-dict are easier to remember and still reasonably short.

And yes, there's always the option to set your own keyword within your user shortcuts (soon even easier with the new include:!)

@m4h4: Thank you also for your opinion.

georgjaehnig commented 1 year ago

Fixed and reorganized with #35.