Closed nbkhwjm closed 10 months ago
Update:
I was able to get a VERY SMALL extract to import without failing; however, it's not translating. Not trying to stack issues here :-) just want to give a full picture of what's going on.
Confirmed the install with "make test":
`root@pb5:/osml10n# make test
cd lua_osml10/tests/ && ./runtests.lua
calling osml10n.unaccent("Besançon"): [OK] (expected Besancon, got Besancon)
calling osml10n.unaccent("München"): [OK] (expected Munchen, got Munchen)
calling osml10n.unaccent("Brüssel"): [OK] (expected Brussel, got Brussel)
calling osml10n.is_latin("Eigenheimstraße"): [OK] (expected true, got true)
calling osml10n.is_latin("улица Воздвиженка"): [OK] (expected false, got false)
calling osml10n.contains_cjk("Eigenheimstraße"): [OK] (expected false, got false)
calling osml10n.contains_cjk("100 漢字"): [OK] (expected true, got true)
calling osml10n.contains_cyrillic("Eigenheimstraße"): [OK] (expected false, got false)
calling osml10n.contains_cyrillic("улица Воздвиженка"): [OK] (expected true, got true)
calling osml10n.list2string({ "Indien", "भारत", "India" }, "|"): [OK] (expected Indien|भारत|India, got Indien|भारत|India)
calling osml10n.get_country_name({ ["name:de"] = "Indien", ["ISO3166-1:alpha2"] = "IN", ["name:en"] = "India", ["name:hi"] = "भारत" }, "de"): [OK] (expected { "Indien", "भारत", "India" }, got { "Indien", "भारत", "India" })
calling osml10n.get_country_name({ ["name:de"] = "Indien", ["ISO3166-1:alpha2"] = "IN", ["name:en"] = "India", ["name:hi"] = "भारत" }, "de", true): [OK] (expected { "भारत", "India", "Indien" }, got { "भारत", "India", "Indien" })
calling osml10n.get_country_name({ ["name:de"] = "Indien", ["ISO3166-1:alpha2"] = "IN", ["name:en"] = "India", ["name:hi"] = "भारत" }, "en"): [OK] (expected { "India", "भारत" }, got { "India", "भारत" })
calling osml10n.street_abbrev("Doktor-No-Straße", "de"): [OK] (expected Dr.-No-Str., got Dr.-No-Str.)
calling osml10n.street_abbrev("Schillerstraße", "de"): [OK] (expected Schillerstr., got Schillerstr.)
calling osml10n.street_abbrev("Kronenplatz", "de"): [OK] (expected Kronenpl., got Kronenpl.)
calling osml10n.street_abbrev("Gottesauer Platz", "de"): [OK] (expected Gottesauer Pl., got Gottesauer Pl.)
calling osml10n.street_abbrev("Mulholland Drive", "en"): [OK] (expected Mulholland Dr., got Mulholland Dr.)
calling osml10n.street_abbrev("92 Avenue NW", "en"): [OK] (expected 92 Ave. NW, got 92 Ave. NW)
calling osml10n.street_abbrev("1st Avenue", "en"): [OK] (expected 1st Ave., got 1st Ave.)
calling osml10n.street_abbrev("2nd Avenue", "en"): [OK] (expected 2nd Ave., got 2nd Ave.)
calling osml10n.street_abbrev("5th Avenue", "en"): [OK] (expected 5th Ave., got 5th Ave.)
calling osml10n.street_abbrev("William S Canning Boulevard", "en"): [OK] (expected William S Canning Blvd., got William S Canning Blvd.)
calling osml10n.street_abbrev("Main Road", "en"): [OK] (expected Main Rd., got Main Rd.)
calling osml10n.street_abbrev("Sabin Place", "en"): [OK] (expected Sabin Pl., got Sabin Pl.)
calling osml10n.street_abbrev("Trafalgar Square", "en"): [OK] (expected Trafalgar Sq., got Trafalgar Sq.)
calling osml10n.street_abbrev("Oregon Expressway", "en"): [OK] (expected Oregon Expy, got Oregon Expy)
calling osml10n.street_abbrev("Juniperro Serra Freeway", "en"): [OK] (expected Juniperro Serra Fwy, got Juniperro Serra Fwy)
calling osml10n.street_abbrev("Curtiss Parkway", "en"): [OK] (expected Curtiss Pkwy, got Curtiss Pkwy)
calling osml10n.street_abbrev("Parkway Drive", "en"): [OK] (expected Parkway Dr., got Parkway Dr.)
calling osml10n.street_abbrev("North 50th Street", "en"): [OK] (expected N 50th St., got N 50th St.)
calling osml10n.street_abbrev("Carrol Street Southeast", "en"): [OK] (expected Carrol St. SE, got Carrol St. SE)
calling osml10n.street_abbrev("Avenue de la Gare", "fr"): [OK] (expected Av. de la Gare, got Av. de la Gare)
calling osml10n.street_abbrev("1re Avenue du Domaine-Patry", "fr"): [OK] (expected 1re Av. du Domaine-Patry, got 1re Av. du Domaine-Patry)
calling osml10n.street_abbrev("1e Avenue de la Montée-Gordon", "fr"): [OK] (expected 1re Av. de la Montée-Gordon, got 1re Av. de la Montée-Gordon)
calling osml10n.street_abbrev("2e Avenue du Lac-des-Pins", "fr"): [OK] (expected 2e Av. du Lac-des-Pins, got 2e Av. du Lac-des-Pins)
calling osml10n.street_abbrev("201e Avenue", "fr"): [OK] (expected 201e Av., got 201e Av.)
calling osml10n.street_abbrev("Boulevard de Pérolles", "fr"): [OK] (expected Bd de Pérolles, got Bd de Pérolles)
calling osml10n.street_abbrev("Chemin des bains", "fr"): [OK] (expected Ch. des bains, got Ch. des bains)
calling osml10n.street_abbrev("Esplanade de lancienne gare", "fr"): [OK] (expected Espl. de lancienne gare, got Espl. de lancienne gare)
calling osml10n.street_abbrev("Impasse de la forêt", "fr"): [OK] (expected Imp. de la forêt, got Imp. de la forêt)
calling osml10n.street_abbrev("Passage de Cardinal", "fr"): [OK] (expected Pass. de Cardinal, got Pass. de Cardinal)
calling osml10n.street_abbrev("Promenade du Barrage", "fr"): [OK] (expected Prom. du Barrage, got Prom. du Barrage)
calling osml10n.street_abbrev("Ruelle des Tonneliers", "fr"): [OK] (expected Rle des Tonneliers, got Rle des Tonneliers)
calling osml10n.street_abbrev("Route de Marly", "fr"): [OK] (expected Rte de Marly, got Rte de Marly)
calling osml10n.street_abbrev("Sentier du Stand", "fr"): [OK] (expected Sent. du Stand, got Sent. du Stand)
calling osml10n.geo_transcript("42", "東京", { 138.79, 36.08, 139.51, 36.77 }): [OK] (expected Toukyou, got Toukyou)
calling osml10n.geo_transcript("42", "漢字 100 abc", { 138.79, 36.08, 139.51, 36.77 }): [OK] (expected Kanji 100 abc, got Kanji 100 abc)
calling osml10n.geo_transcript("42", "東京", { 113.05, 29.45, 115.73, 32.13 }): [OK] (expected dōng jīng, got dōng jīng)
calling osml10n.geo_transcript("42", "漢字 100 abc", { 113.05, 29.45, 115.73, 32.13 }): [OK] (expected hàn zì 100 abc, got hàn zì 100 abc)
calling osml10n.geo_transcript("42", "北京", { -30, 49, -29, 50 }): [OK] (expected běi jīng, got běi jīng)
calling osml10n.geo_transcript("42", "ห้องสมุดประชาชน", { 100, 14, 101, 15 }): [OK] (expected hongsamut prachachon, got hongsamut prachachon)
calling osml10n.geo_transcript("42", "thai ถนนข้าวสาร 100", { 100, 14, 101, 15 }): [OK] (expected thai thanon khaosan 100, got thai thanon khaosan 100)
calling osml10n.geo_transcript("42", "อนุสาวรีย์พระยารัษฎาณุประดิษฐ์", { 100, 14, 101, 15 }): [OK] (expected anusawari phraya ratsa da nu pradit, got anusawari phraya ratsa da nu pradit)
calling osml10n.geo_transcript("42", "香港", { 113.54, 22.16, 113.58, 22.2 }): [OK] (expected hōeng góng, got hōeng góng)
calling osml10n.geo_transcript("42", "香港", { 114.15, 22.28, 114.2, 22.33 }): [OK] (expected hōeng góng, got hōeng góng)
calling osml10n.geo_transcript("42", "Москва́"): [OK] (expected Moskvá, got Moskvá)
calling osml10n.geo_transcript("42", "Москва́", { -30, 49, -29, 50 }): [OK] (expected Moskvá, got Moskvá)
calling osml10n.geo_transcript("42", "some/name", { 114.15, 22.28, 114.2, 22.33 }): [OK] (expected some/name, got some/name)
calling osml10n.geo_transcript("42", "some/name"): [OK] (expected some/name, got some/name)
calling osml10n.get_placename_from_tags("", { ["name"] = "Москва́", ["name:de"] = "Moskau", ["name:en"] = "Moscow" }, true, " - ", "de"): [OK] (expected Москва́ - Moskau, got Москва́ - Moskau)
calling osml10n.get_placename_from_tags("", { ["name"] = "Москва́", ["name:de"] = "Moskau", ["name:en"] = "Moscow" }, false, "|", "de"): [OK] (expected Moskau|Москва́, got Moskau|Москва́)
calling osml10n.get_placename_from_tags("", { ["name"] = "London", ["name:de"] = "London", ["name:en"] = "London" }, false, "|", "de"): [OK] (expected London, got London)
calling osml10n.get_placename_from_tags("", { ["name"] = "القاهرة", ["name:de"] = "Kairo", ["int_name"] = "Cairo", ["name:en"] = "Cairo" }, false, "|"): [OK] (expected Cairo|القاهرة, got Cairo|القاهرة)
calling osml10n.get_placename_from_tags("", { ["name"] = "Bruxelles - Brussel", ["name:fr"] = "Bruxelles", ["name:af"] = "Brussel", ["name:fo"] = "Brussel", ["name:de"] = "Brüssel", ["name:en"] = "Brussels", ["name:xx"] = "Brussel" }, false, "|", "de"): [OK] (expected Brüssel|Bruxelles, got Brüssel|Bruxelles)
calling osml10n.get_placename_from_tags("", { ["name"] = "Brixen - Bressanone", ["name:de"] = "Brixen", ["name:it"] = "Bressanone" }, false, "|", "de"): [OK] (expected Brixen|Bressanone, got Brixen|Bressanone)
calling osml10n.get_placename_from_tags("", { ["name"] = "Brixen - Bressanone", ["name:de"] = "Brixen" }, false, "|", "de"): [OK] (expected Brixen, got Brixen)
calling osml10n.get_placename_from_tags("", { ["name"] = "Merano - Meran", ["name:de"] = "Meran", ["name:it"] = "Merano" }, true, "|", "de"): [OK] (expected Merano|Meran, got Merano|Meran)
calling osml10n.get_placename_from_tags("", { ["name"] = "Meran - Merano", ["name:de"] = "Meran", ["name:it"] = "Merano" }, true, "|", "de"): [OK] (expected Meran|Merano, got Meran|Merano)
calling osml10n.get_placename_from_tags("", { ["name"] = "Roma", ["name:de"] = "Rom" }, false, "|", "de"): [OK] (expected Rom|Roma, got Rom|Roma)
calling osml10n.get_streetname_from_tags("", { ["name"] = "Dr. No Street", ["name:de"] = "Professor-Doktor-No-Straße" }, false, " - ", "de"): [OK] (expected Prof.-Dr.-No-Str. - Dr. No St., got Prof.-Dr.-No-Str. - Dr. No St.)
calling osml10n.get_placename_from_tags("", { ["name"] = "Dr. No Street", ["name:de"] = "Doktor-No-Straße" }, false, " - ", "de"): [OK] (expected Doktor-No-Straße - Dr. No Street, got Doktor-No-Straße - Dr. No Street)
calling osml10n.get_placename_from_tags("", { ["name:de"] = "Doktor-No-Straße" }, false, " - ", "de"): [OK] (expected Doktor-No-Straße, got Doktor-No-Straße)
calling osml10n.get_streetname_from_tags("", { ["name:de"] = "Doktor-No-Straße" }, false, " - ", "de"): [OK] (expected Dr.-No-Str., got Dr.-No-Str.)
calling osml10n.get_localized_name_from_tags("", { ["name"] = "Dr. No Street", ["name:de"] = "Doktor-No-Straße" }, "de"): [OK] (expected Doktor-No-Straße, got Doktor-No-Straße)
calling osml10n.get_localized_name_from_tags("", { ["name:de"] = "Doktor-No-Straße" }, "de"): [OK] (expected Doktor-No-Straße, got Doktor-No-Straße)
calling osml10n.get_localized_name_from_tags("", { ["name"] = "北京" }, "de"): [OK] (expected běi jīng, got běi jīng)
calling osml10n.get_localized_name_from_tags("", { ["name"] = "北京" }, "de", { 138.79, 36.08, 139.51, 36.77 }): [OK] (expected Pekin, got Pekin)
calling osml10n.get_streetname_from_tags(" - ", { ["name"] = "улица Воздвиженка", ["name:en"] = "Vozdvizhenka Street" }, true, " - ", "de"): [OK] (expected ул. Воздвиженка - Vozdvizhenka St., got ул. Воздвиженка - Vozdvizhenka St.)
calling osml10n.get_streetname_from_tags("", { ["name"] = "улица Воздвиженка" }, true, " - ", "de"): [OK] (expected ул. Воздвиженка - ul. Vozdviženka, got ул. Воздвиженка - ul. Vozdviženka)
calling osml10n.get_streetname_from_tags("", { ["name"] = "вулиця Молока" }, true, " - ", "de"): [OK] (expected вул. Молока - vul. Moloka, got вул. Молока - vul. Moloka)
calling osml10n.get_placename_from_tags("", { ["name"] = "주촌 Juchon", ["name:ko_rm"] = "Juchon", ["name:ko"] = "주촌" }, true, "|"): [OK] (expected 주촌|Juchon, got 주촌|Juchon)
calling osml10n.get_placename_from_tags("", { ["name"] = "주촌", ["name:ko_rm"] = "Juchon", ["name:ko"] = "주촌" }, false, "|"): [OK] (expected Juchon|주촌, got Juchon|주촌)
calling osml10n.get_streetname_from_tags("", { ["name"] = "ဘုရားကိုင်လမ်း Pha Yar Kai Road", ["name:my"] = "ဘုရားကိုင်လမ်း", ["name:en"] = "Pha Yar Kai Road", ["highway"] = "secondary" }, true, "|"): [OK] (expected ဘုရားကိုင်လမ်း|Pha Yar Kai Rd., got ဘုရားကိုင်လမ်း|Pha Yar Kai Rd.)
calling osml10n.get_streetname_from_tags("", { ["name"] = "ဘုရားကိုင်လမ်း", ["name:my"] = "ဘုရားကိုင်လမ်း", ["name:en"] = "Pha Yar Kai Road", ["highway"] = "secondary" }, true, "|"): [OK] (expected ဘုရားကိုင်လမ်း|Pha Yar Kai Rd., got ဘုရားကိုင်လမ်း|Pha Yar Kai Rd.)
calling osml10n.get_streetname_from_tags("", { ["name"] = "鳳凰徑第3段 Lantau Trail Section 3", ["name:zh"] = "鳳凰徑第3段", ["name:en"] = "Lantau Trail Section 3", ["name:yue"] = "鳳凰徑" }, true, "|"): [OK] (expected 鳳凰徑第3段|Lantau Trail Section 3, got 鳳凰徑第3段|Lantau Trail Section 3)
calling osml10n.get_placename_from_tags("", { ["name"] = "Bouira البويرة ⵝⵓⵠⵉⵔⴻⵜ", ["name:de"] = "Bouira", ["name:ber"] = "ⵝⵓⵠⵉⵔⴻⵜ", ["name:ar"] = "البويرة" }, false, "|", "de"): [OK] (expected Bouira|البويرة|ⵝⵓⵠⵉⵔⴻⵜ, got Bouira|البويرة|ⵝⵓⵠⵉⵔⴻⵜ)
87 tests passed, 0 tests failed.`
osm2pgsql Command
postgres@pb5:/osm2/osm2pgsql/build$ ./osm2pgsql -G -O flex -d gis -S /osml10n/openstreetmap-carto-hstore-only-l10n.lua libya-latest.osm.pbf
2023-12-26 23:47:19 osm2pgsql version 1.10.0 (1.10.0-9-g4a8f42b5)
2023-12-26 23:47:19 Database version: 16.1 (Ubuntu 16.1-1.pgdg22.04+1)
2023-12-26 23:47:19 PostGIS version: 3.4
2023-12-26 23:47:19 Storing properties to table '"public"."osm2pgsql_properties"'.
2023-12-26 23:47:19 WARNING: The 'area' column type is deprecated. Please read
2023-12-26 23:47:19 WARNING: https://osm2pgsql.org/doc/tutorials/switching-from-add-row-to-insert/
2023-12-26 23:47:20 WARNING: The add_row() function is deprecated. Please read
2023-12-26 23:47:20 WARNING: https://osm2pgsql.org/doc/tutorials/switching-from-add-row-to-insert/
2023-12-26 23:49:53 Reading input files done in 153s (2m 33s).
2023-12-26 23:49:53 Processed 10191503 nodes in 13s - 784k/s
2023-12-26 23:49:53 Processed 1362288 ways in 138s (2m 18s) - 10k/s
2023-12-26 23:49:53 Processed 1299 relations in 2s - 650/s
2023-12-26 23:49:54 No marked ways (Skipping stage 2).
2023-12-26 23:49:54 Clustering table 'planet_osm_hstore_point' by geometry...
2023-12-26 23:49:54 Clustering table 'planet_osm_hstore_polygon' by geometry...
2023-12-26 23:49:54 Clustering table 'planet_osm_hstore_roads' by geometry...
2023-12-26 23:49:54 Clustering table 'planet_osm_hstore_line' by geometry...
2023-12-26 23:49:54 Creating index on table 'planet_osm_hstore_roads' ("way")...
2023-12-26 23:49:54 Analyzing table 'planet_osm_hstore_roads'...
2023-12-26 23:49:54 No indexes to create on table 'planet_osm_hstore_route'.
2023-12-26 23:49:54 Analyzing table 'planet_osm_hstore_route'...
2023-12-26 23:49:54 Creating index on table 'planet_osm_hstore_point' ("way")...
2023-12-26 23:49:55 Analyzing table 'planet_osm_hstore_point'...
2023-12-26 23:49:55 All postprocessing on table 'planet_osm_hstore_point' done in 1s.
2023-12-26 23:49:55 Creating index on table 'planet_osm_hstore_line' ("way")...
2023-12-26 23:49:56 Analyzing table 'planet_osm_hstore_line'...
2023-12-26 23:49:56 All postprocessing on table 'planet_osm_hstore_line' done in 2s.
2023-12-26 23:49:56 All postprocessing on table 'planet_osm_hstore_roads' done in 0s.
2023-12-26 23:49:57 Creating index on table 'planet_osm_hstore_polygon' ("way")...
2023-12-26 23:49:59 Analyzing table 'planet_osm_hstore_polygon'...
2023-12-26 23:49:59 All postprocessing on table 'planet_osm_hstore_polygon' done in 5s.
2023-12-26 23:49:59 All postprocessing on table 'planet_osm_hstore_route' done in 0s.
2023-12-26 23:49:59 Storing properties to table '"public"."osm2pgsql_properties"'.
2023-12-26 23:49:59 osm2pgsql took 160s (2m 40s) overall.
I downgraded osm2pgsql to 1.7.0 and it fails in the same way.
postgres@pb5:/osm_1-7-0/osm2pgsql-1.7.0/build$ ./osm2pgsql -G -O flex -d gis --number-processes 40 -S /osml10n/openstreetmap-carto-hstore-only-l10n.lua /osm2/osm2pgsql/build/japan-latest.osm.pbf
2023-12-27 00:30:57 osm2pgsql version 1.7.0
2023-12-27 00:30:57 WARNING: --number-processes too large. Set to 32.
2023-12-27 00:30:57 Database version: 16.1 (Ubuntu 16.1-1.pgdg22.04+1)
2023-12-27 00:30:57 PostGIS version: 3.4
Processing: Node(246860k 219.4k/s) Way(0k 0.00k/s) Relation(0 0.0/s)free(): invalid size
Aborted (core dumped)
I was able to successfully import with this change, ok to close this...
OK, so this is not an issue in osml10n but in lrexlib-pcre, which I want to get rid of in the next release anyway.
Problem gone in versions > 1.1
SUMMARY I'm trying to get a map with all-English titles. This is a newly built system that imported the whole planet quite easily with a standard command: it resulted in a fully functional map, with non-English titles in Asia, North Africa, etc., which is totally normal. I tried using Localization functions for Openstreetmap (osml10n) to get English titles, and it always core dumps. From examining it in a debugger and valgrind, it looks like the problem could be in osm2pgsql, but I'm unsure, so I'm submitting here to see.
I also posted on osm2pgsql, since I'm not sure which project it could be (I know that might be "bad form", but it's hard to tell what's causing this) --> https://github.com/osm2pgsql-dev/osm2pgsql/issues/2117
DETAILS All "make test" tests passed; the only issue is the core dump.
When I try to use the Localization functions for Openstreetmap (osml10n), it fails with "free(): invalid size Aborted (core dumped)" (see below). I've tried this on a number of different extracts from Geofabrik and checked the md5 on all of them.
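For anyone reproducing this, the checksum check could look roughly like the sketch below. It assumes each extract was downloaded together with a matching `.md5` file in the usual `md5sum` format (hash, two spaces, filename); the `verify_extract` helper name is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: verify a downloaded extract against its .md5 file.
# Assumption: "<extract>.md5" sits next to the extract and is in md5sum format.
verify_extract() {
    pbf=$1
    dir=$(dirname "$pbf")
    # md5sum -c looks up the filename listed in the .md5 file relative to the
    # current directory, so run it from the directory holding the extract.
    ( cd "$dir" && md5sum -c "$(basename "$pbf").md5" >/dev/null 2>&1 )
}

# Example call (filename taken from the import command in this report):
# verify_extract libya-latest.osm.pbf && echo "checksum OK"
```

The function returns `md5sum`'s exit status, so a corrupted or truncated download shows up as a non-zero return before the import is even started.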
I recognize that it's possible that the issue is with the other project, so I ran this in gdb to try to narrow it down.
This didn't show me a lot, so I tried valgrind. The error is in a Lua utility (lua-utils.cpp:219), and it could be something on the osm2pgsql side. Valgrind details below.
(I'll acknowledge that some of these tests were run with different PBF files; I lost some of my testing sessions, but all sessions gave the same result regardless of the PBF file used.)
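The exact gdb and valgrind invocations are not shown in this report, so the following is only a sketch of commonly used forms, wrapped in a hypothetical `print_debug_cmds` helper that prints the commands rather than running them. The osm2pgsql arguments are copied from the import command in this thread.

```shell
#!/bin/sh
# Sketch only: assumed debugger invocations, not the ones actually used here.
# Dry run: prints the command lines instead of executing them.
print_debug_cmds() {
    # "$*" joins the wrapped command and its arguments with single spaces.
    # gdb: run the program non-interactively and print a backtrace on a crash.
    printf '%s\n' "gdb -batch -ex run -ex bt --args $*"
    # valgrind: report invalid frees and where uninitialised values came from.
    printf '%s\n' "valgrind --track-origins=yes $*"
}

print_debug_cmds ./osm2pgsql -G -O flex -d gis \
    -S /osml10n/openstreetmap-carto-hstore-only-l10n.lua libya-latest.osm.pbf
```

Running the same import under both tools on the smallest extract that still crashes keeps the valgrind run time manageable.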
What version of osm2pgsql are you using?
osm2pgsql version 1.10.0 (1.10.0-9-g4a8f42b5)
What operating system and PostgreSQL/PostGIS version are you using?
Ubuntu 22 Linux HOSTNAME 5.15.0-91-generic #101-Ubuntu SMP Tue Nov 14 13:30:08 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Tell us something about your system
2 CPUs / 10 cores each / 40 threads total, bare metal, 386 GB RAM.