giggls / mapnik-german-l10n

OSM map l10n functions

More french abbreviations - How to proceed? #29

Closed chatelao closed 5 years ago

chatelao commented 5 years ago

I need your opinion:

Gretchen Question:

--
-- Name: fr_abbrev(text); Type: FUNCTION; Schema: public; Owner: postgres
--
CREATE OR REPLACE FUNCTION fr_abbrev(longtext text) RETURNS TEXT AS $$
 DECLARE
  abbrev text;
 BEGIN
    abbrev := longtext;
    -- Word endings and generic terms (schools, churches, offices)
    abbrev := replace(abbrev, 'lémentaire ','lem. ');
    abbrev := replace(abbrev, 'econdaire ','econd. ');
    abbrev := replace(abbrev, 'rimaire ','rim. ');
    abbrev := replace(abbrev, 'aternelle ','at. ');
    abbrev := replace(abbrev, 'ommerciale ','omm. ');
    abbrev := replace(abbrev, 'Direction ','Dir. ');
    abbrev := replace(abbrev, 'Chapelle ','Chap. ');
    abbrev := replace(abbrev, 'Cathédrale ','Cath. ');
    abbrev := replace(abbrev, ' Notre-Dame ',' N.D. ');

    -- Street types
    abbrev := replace(abbrev, 'Avenue ','Av. ');
    abbrev := replace(abbrev, 'Boulevard ','Bd. ');
    abbrev := replace(abbrev, 'Esplanade ','Espl. ');
    abbrev := replace(abbrev, 'Faubourg ','Fbg. ');
    abbrev := replace(abbrev, 'Passage ','Pass. ');
    abbrev := replace(abbrev, 'Place ','Pl. ');
    abbrev := replace(abbrev, 'Promenade ','Prom. ');
    abbrev := replace(abbrev, 'Impasse ','Imp. ');

    abbrev := replace(abbrev, 'Square ','Sq. ');

    -- Buildings, titles, ranks and institutions
    abbrev := replace(abbrev, 'Centre Commercial ','CCial. ');
    abbrev := replace(abbrev, 'Immeuble ','Imm. ');
    abbrev := replace(abbrev, 'Lotissement ','Lot. ');
    abbrev := replace(abbrev, 'Résidence ','Rés. ');
    abbrev := replace(abbrev, 'Zone Industrielle ','ZI. ');
    abbrev := replace(abbrev, 'Adjudant ','Adj. ');
    abbrev := replace(abbrev, 'Agricole ','Agric. ');
    abbrev := replace(abbrev, 'Arrondissement','Arrond.');
    abbrev := replace(abbrev, 'Aspirant ','Asp. ');
    abbrev := replace(abbrev, 'Colonel ','Col. ');
    abbrev := replace(abbrev, 'Commandant ','Cdt. ');
    abbrev := replace(abbrev, 'Commercial ','Cial. ');
    abbrev := replace(abbrev, 'Coopérative ','Coop. ');
    abbrev := replace(abbrev, 'Division ','Div. ');
    abbrev := replace(abbrev, 'Docteur ','Dr. ');
    abbrev := replace(abbrev, 'Général ','Gén. ');
    abbrev := replace(abbrev, 'Institut ','Inst. ');
    abbrev := replace(abbrev, 'Faculté ','Fac. ');
    abbrev := replace(abbrev, 'Laboratoire ','Labo. ');
    abbrev := replace(abbrev, 'Lieutenant ','Lt. ');
    abbrev := replace(abbrev, 'Maréchal ','Mal. ');
    abbrev := replace(abbrev, 'Ministère ','Min. ');
    abbrev := replace(abbrev, 'Monseigneur ','Mgr. ');
    abbrev := replace(abbrev, 'Médiathèque ','Médiat. ');
    abbrev := replace(abbrev, 'Bibliothèque ','Bibl. ');
    abbrev := replace(abbrev, 'Tribunal ','Trib. ');
    abbrev := replace(abbrev, 'Observatoire ','Obs. ');
    abbrev := replace(abbrev, 'Périphérique ','Périph. ');
    abbrev := replace(abbrev, 'Préfecture ','Préf. ');
    abbrev := replace(abbrev, 'Président ','Pdt. ');
    abbrev := replace(abbrev, 'Régiment ','Rgt. ');
    abbrev := replace(abbrev, 'Saint-','Sᵗ-');
    abbrev := replace(abbrev, 'Sainte-','Sᵗᵉ-');
    abbrev := replace(abbrev, 'Sergent ','Sgt. ');
    abbrev := replace(abbrev, 'Université ','Univ. ');

    -- Regex-based rules; without the 'g' flag only the first match is replaced
    abbrev := regexp_replace(abbrev, 'Communauté d.[Aa]gglomération','Comm. d''agglo.');
    abbrev := regexp_replace(abbrev, 'Communauté [Uu]rbaine ','Comm. urb. ');
    abbrev := regexp_replace(abbrev, 'Communauté de [Cc]ommunes ','Comm. comm. ');
    abbrev := regexp_replace(abbrev, 'Syndicat d.[Aa]gglomération ','Synd. d''agglo. ');
    abbrev := regexp_replace(abbrev, '^Chemin ','Ch. ');
    abbrev := regexp_replace(abbrev, '^Institut ','Inst. ');
    abbrev := regexp_replace(abbrev, 'Zone d.[Aa]ctivité.? [ÉEée]conomique.? ','Z.A.E. ');
    abbrev := regexp_replace(abbrev, 'Zone d.[Aa]ctivité.? ','Z.A. ');
    abbrev := regexp_replace(abbrev, 'Zone [Aa]rtisanale ','Zone Art. ');
    abbrev := regexp_replace(abbrev, 'Zone [Ii]ndustrielle ','Z.I. ');
    abbrev := regexp_replace(abbrev, ' [Pp]ubli(c|que) ',' Publ. ');
    abbrev := regexp_replace(abbrev, ' [Pp]rofessionnel(|le) ',' Prof. ');
    abbrev := regexp_replace(abbrev, ' [Tt]echnologique ',' Techno. ');
    abbrev := regexp_replace(abbrev, ' [Pp]olyvalent ',' Polyv. ');
    abbrev := regexp_replace(abbrev, '[EÉeé]tablissement(|s) ','Éts. ');
    abbrev := regexp_replace(abbrev, ' [Mm]unicipal(|e) ',' Munic. ');
    abbrev := regexp_replace(abbrev, ' [Dd]épartemental(|e) ',' Départ. ');
    abbrev := regexp_replace(abbrev, ' [Ii]ntercommunal(|le) ',' Interco. ');
    abbrev := regexp_replace(abbrev, ' [Rr]égional(|e) ',' Région. ');
    abbrev := regexp_replace(abbrev, ' [Ii]nterdépartemental(|e) ',' Interdép. ');
    abbrev := regexp_replace(abbrev, ' [Hh]ospitali(er|ère) ',' Hospit. ');
    abbrev := regexp_replace(abbrev, ' [EÉeé]lectrique ',' Élect. ');
    abbrev := regexp_replace(abbrev, ' [Ss]upérieur(|e) ',' Sup. ');
    abbrev := regexp_replace(abbrev, '^[Bb][aâ]timent ','Bât. ');
    abbrev := regexp_replace(abbrev, '[Aa]éronautique ','Aéron. ');

    RETURN abbrev;
 END;
$$ LANGUAGE plpgsql IMMUTABLE;

dieterdreist commented 5 years ago

From an Italian perspective,

'Commercial ' may work (because the typical Italian form is "Commerciale" with a trailing e), but it would interfere with English "commercial", and I guess 'Cial. ' is not a suitable abbreviation in English.

'ommerciale ', on the other hand, will interfere with Italian "commerciale" and "Commerciale", but "omm." could be seen as a suitable abbreviation in Italian as well.

'[Ii]ntercommunal(|le) ',' Interco. ' will also potentially interfere with Italian (intercommunale), and "Interco." doesn't seem a suitable abbreviation (IMHO, but I am not a native speaker).

' [Mm]unicipal(|e) ',' Munic. ' also matches a term common in Italian, e.g. "Polizia Municipale ", and "Munic." doesn't seem suitable there (it is not common; it could be understood with some imagination, but isn't what you'd expect).

'^Institut ','Inst. ' matches a word that is also common in German, but the abbreviation could work there (though it is probably not desirable when there is sufficient space to render the unabbreviated term).

'Division ','Div. ' matches a word used in many languages (e.g. German and English), but the abbreviation might work there as well.

"Square" doesn't seem to be French; has it been overlooked?
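The interference described above can be reproduced directly in PostgreSQL. The following is only a minimal sketch: the two sample names are made up for illustration, and the rules are copied from the proposal rather than taken from any deployed code.

```sql
-- Hypothetical sample names, chosen only to illustrate cross-language matches.
-- An Italian name hit by the French ' [Mm]unicipal(|e) ' rule:
SELECT regexp_replace('Polizia Municipale di Roma',
                      ' [Mm]unicipal(|e) ', ' Munic. ') AS italian_example;
-- yields 'Polizia Munic. di Roma'

-- An English street name hit by the French 'Commercial ' rule:
SELECT replace('Commercial Road', 'Commercial ', 'Cial. ') AS english_example;
-- yields 'Cial. Road'
```

Neither rule has any notion of the language of the input, so any name containing the same character sequence is rewritten.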

giggls commented 5 years ago

Well, producing abbreviations is a side feature of osml10n at best!

Actually, it is used in only one particular case: the generation of street-names.

This is because bilingual street-names tend to get very long.

This code does not abbreviate any names other than street-names, and I will certainly keep it this way. Country-dependent abbreviation calls would be possible, but they would be way too expensive for a side feature like this.

French is not such a big problem anyway, as bilingual street-names will appear mostly in areas where non-Latin script is used.

So feel free to add missing street-abbreviations from this code, but forget about the rest.
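A street-only variant along these lines might look like the sketch below. The function name `fr_street_abbrev_sketch` is hypothetical, and anchoring each pattern at the start of the name with `^` is an assumption made here to limit false matches; it is not taken from the existing osml10n code.

```sql
-- Hypothetical sketch: abbreviate only a leading French street type.
CREATE OR REPLACE FUNCTION fr_street_abbrev_sketch(longname text) RETURNS text AS $$
DECLARE
  abbrev text;
BEGIN
  abbrev := longname;
  -- Each pattern is anchored with ^, so a street type that appears later
  -- in the name (e.g. 'Rue de la Place ...') is left untouched.
  abbrev := regexp_replace(abbrev, '^Avenue ', 'Av. ');
  abbrev := regexp_replace(abbrev, '^Boulevard ', 'Bd. ');
  abbrev := regexp_replace(abbrev, '^Esplanade ', 'Espl. ');
  abbrev := regexp_replace(abbrev, '^Faubourg ', 'Fbg. ');
  abbrev := regexp_replace(abbrev, '^Passage ', 'Pass. ');
  abbrev := regexp_replace(abbrev, '^Place ', 'Pl. ');
  abbrev := regexp_replace(abbrev, '^Promenade ', 'Prom. ');
  abbrev := regexp_replace(abbrev, '^Impasse ', 'Imp. ');
  RETURN abbrev;
END;
$$ LANGUAGE plpgsql IMMUTABLE;

SELECT fr_street_abbrev_sketch('Boulevard Saint-Michel');  -- 'Bd. Saint-Michel'
```

Restricting the list to street types keeps the function cheap and avoids most of the cross-language collisions raised earlier in the thread.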

As a hint for the future: Please do not try to implement further strange features like this out of the blue. There are areas where real work can be done!

Just have a look at the rendered map. This is what this code is used for after all.

If you see places where names are rendered in a suboptimal way, come back here and try to change the code in a way that will produce a better looking map.

Did you actually watch my talks? If not, please do, because I talk about some of the real problems there. And do not forget that this code will never be a 100% perfect solution for every place name in the world.

chatelao commented 5 years ago

Hi Sven, help me out briefly: which talks do you mean exactly (i.e., where can I find them)? I'd love to contribute to "real" (effective) work in the endless ocean of OSM.

giggls commented 5 years ago

Huh? One of my talks about this particular software, linked in the wiki of this particular software.

chatelao commented 5 years ago

Oops, I only read the README and didn't notice the wiki 🙈