UHaifa-IS / whgazetteer-mehdie

World Historical Gazetteer - MEHDIE version
http://whgazetteer.org
BSD 3-Clause "New" or "Revised" License

locations are confused #173

Closed sinairusinek closed 8 months ago

sinairusinek commented 8 months ago
[Screen Shot 2024-02-06 at 7 26 46] [Screen Shot 2024-02-06 at 7 26 31]

Why are there four colors?

  1. Many places got false coordinates for an unknown reason. For example: [Screen Shot 2024-02-06 at 12 27 59] Frisia, a region in Europe that is correctly linked to a Kima record, had no coordinates in the original file. For some reason it is shown in Egypt, with a link to Yaqut that does not open. Also, the link to Kima, which does work, is labelled "data". Another example: Hungary, which also had no coordinates in the original file, got the coordinates of Istanbul. The same happens with many other places in this file. Lucca in Italy had the correct coordinates (43.840677, 10.50602) but appears in Belarus instead:

[Screen Shot 2024-02-06 at 12 51 34] This happens in both the browsing and the matching maps.

tomersagi commented 8 months ago

It seems that saving the "different" links causes the system to insert new rows into the place_geom table for the false matches.

tomersagi commented 8 months ago

The same happens here, and the "different" links also appear again under "original" for some reason:

image

tomersagi commented 8 months ago

To fix the falsely added geoms we can use a query like this one:

SELECT
    pg.id,
    td.dataset_2::int AS other_dataset,
    p.id AS other_place_id,
    pl."jsonb"->>'type' AS relation_type,
    pl."jsonb",
    pg."jsonb"
FROM place_geom pg
JOIN place_link pl ON pg.task_id = pl.task_id
JOIN public.task_dsids td ON td.task_id = pg.task_id
JOIN places p ON p.src_id = replace((pg."jsonb"->'citation')::jsonb->>'id', 'match_data:', '')
JOIN datasets d ON p.dataset = d."label"
WHERE pg.task_id = '031198fe-89f5-4b72-9f65-88e86924b78e'
    AND d.id = td.dataset_2::int
    AND split_part(pl."jsonb"->>'identifier', ':', 2)::int = p.id
    AND (pl."jsonb"->>'type' IS NULL OR pl."jsonb"->>'type' = 'different');

tomersagi commented 8 months ago

Some of the confusion comes from the fact that places have matches in different tasks. On the map, the review screen should show only the locations of the places being reviewed in this task.

image

This is the background table place_geom:

image

But the data from here should only be taken for the place under review. For the suggested matches, the data should be taken from the hits table:

image
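The rule described above (geometry of the place under review from place_geom, geometry of the suggested matches from the hits of the current task) can be sketched as a small filter. This is only an illustration: the row shapes and the field names `task_id`, `place_id` and `geom` are assumptions, not the actual WHG schema.

```javascript
// Hypothetical sketch: row shapes and field names are assumptions.
function geomsForReview(placeGeomRows, hitRows, taskId, placeId) {
    // Geometry of the place under review comes from place_geom rows only.
    const target = placeGeomRows
        .filter(r => r.place_id === placeId)
        .map(r => r.geom);
    // Candidate geometries come from the hits of the current task only,
    // so matches made in other tasks never reach the map.
    const candidates = hitRows
        .filter(h => h.task_id === taskId && h.place_id === placeId)
        .map(h => h.geom);
    return { target, candidates };
}
```

Keeping the two sources separate means a geometry written to place_geom by another task cannot show up as a candidate marker.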

tomersagi commented 8 months ago

It seems that the container of script elements is processed into the geoms element, which is then used to populate the map:

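for (let i = 0; i < gelems.length; i++) {
    console.log("Processing script element:", gelems[i].id, gelems[i].text);
    // Parse the JSON payload of the script element into a geometry object.
    let t_geom = cleanJson(gelems[i].text);
    // Tag the geometry with the element id and its dataset, falling back to the default ds.
    t_geom['properties'] = {"id": gelems[i].id, "ds": t_geom.ds != null ? t_geom.ds : ds};
    geom['features'].push(t_geom);
}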

image

tomersagi commented 8 months ago

But there are five elements being processed instead of three. One extra element is from the wrong task.

tomersagi commented 8 months ago

The plot thickens... it seems that the problem is coming from the hits table. Each hit should represent a single match candidate, but some of these match candidates were previously mapped to other datasets and therefore acquired superfluous location information in the place_geom table (this was fixed earlier in this issue).

tomersagi commented 8 months ago

This is an example of a problematic geoms element:

{"geoms": [
    {"ds": "md", "id": "U38", "type": "Point",
     "citation": {"id": "match_data:MILA_062E364N_S", "label": "WHG"},
     "coordinates": [6.2883, 36.4248]},
    {"ds": "md", "id": "U38", "type": "Point",
     "citation": {"id": "match_data:MILA_062E364N_S", "label": "WHG"},
     "coordinates": [6.2883, 36.4248]},
    {"ds": "md", "id": "U38", "type": "Point",
     "citation": {"id": "match_data:dig05341", "label": "WHG"},
     "coordinates": [57.3, 22.96666667]}
]}

This is from task_id = 'e26d945b-37a1-4ace-b98f-471e4ecd3b8c' and place_id = 159469 (matching DamastArabic to
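One way to tidy up a geoms list like the one above is to drop exact repeats and then keep only the entries whose citation points at the expected source record. The sketch below is a hedged illustration, not the fix that was applied: the helper names `dedupeGeoms` and `geomsForSource` are made up here, and it assumes Point geometries and that the expected source id is known to the caller.

```javascript
// Sample data copied from the problematic geoms element above.
const geoms = [
    { ds: "md", id: "U38", type: "Point",
      citation: { id: "match_data:MILA_062E364N_S", label: "WHG" },
      coordinates: [6.2883, 36.4248] },
    { ds: "md", id: "U38", type: "Point",
      citation: { id: "match_data:MILA_062E364N_S", label: "WHG" },
      coordinates: [6.2883, 36.4248] },
    { ds: "md", id: "U38", type: "Point",
      citation: { id: "match_data:dig05341", label: "WHG" },
      coordinates: [57.3, 22.96666667] }
];

// Drop exact repeats, keyed on citation id plus coordinates (Points only).
function dedupeGeoms(list) {
    const seen = new Set();
    return list.filter(g => {
        const key = g.citation.id + "|" + g.coordinates.join(",");
        if (seen.has(key)) return false;
        seen.add(key);
        return true;
    });
}

// Keep only geometries whose citation points at the given source id.
function geomsForSource(list, srcId) {
    return list.filter(g => g.citation.id === "match_data:" + srcId);
}

const unique = dedupeGeoms(geoms);
console.log(unique.length);                                    // 2
console.log(geomsForSource(unique, "MILA_062E364N_S").length); // 1
```

The `match_data:` prefix handling mirrors the `replace(..., 'match_data:', '')` step in the SQL query earlier in this issue.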

tomersagi commented 8 months ago

The next step is to find a way to populate a JavaScript variable (window.other_geoms ?) from the hit_supplemental, and the step after that is to populate the map from this variable.
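A rough sketch of what populating such a variable could look like is given below. The shape of a hit (`place_id`, `dataset`, `geoms`) and the function name `hitsToFeatureCollection` are assumptions for illustration only, and the sample values are placeholders rather than real data.

```javascript
// Hypothetical sketch: the hit shape ({place_id, dataset, geoms}) is an assumption.
function hitsToFeatureCollection(hits) {
    const features = [];
    for (const hit of hits) {
        for (const g of hit.geoms || []) {
            features.push({
                type: "Feature",
                geometry: { type: g.type, coordinates: g.coordinates },
                // The id lets a map marker be traced back to its hit.
                properties: { id: hit.place_id, ds: hit.dataset }
            });
        }
    }
    return { type: "FeatureCollection", features };
}

// globalThis is `window` in a browser; other_geoms is the name floated above.
globalThis.other_geoms = hitsToFeatureCollection([
    { place_id: 1, dataset: "md", geoms: [{ type: "Point", coordinates: [6.2883, 36.4248] }] },
    { place_id: 2, dataset: "md", geoms: [] }
]);
```

Because every feature carries the id of the hit it came from, the map layer no longer depends on which script tags happen to be present in the page.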

tomersagi commented 8 months ago

Found the piece of code currently generating small script tags that get picked up by the filter:

</div>
{% for g in record.geoms.all %}
    {{ g.jsonb|safe|json_script:record.id }}
{% endfor %}
<div class="record" id="place_detail">

This needs to be modified to take the geoms from the hits_supplemental and to mark the script tags in a way that is easier for jQuery to pick up, because currently it also picks up unrelated elements such as those injected by the Dashlane plugin.
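One possible way to make the selection stricter is sketched below. Django's `json_script` filter emits a `<script>` tag with `type="application/json"` and the id it is given, so an id prefix that only the template uses can serve as the marker. The prefix `geom-` and the function name `pickGeomScripts` are hypothetical, chosen only for this example.

```javascript
// Hypothetical prefix; any id scheme that only the template emits would do.
const GEOM_ID_PREFIX = "geom-";

// `elements` is any array of objects with `id`, `type` and `text`,
// e.g. Array.from(document.querySelectorAll("script")) in a browser.
function pickGeomScripts(elements, prefix = GEOM_ID_PREFIX) {
    return elements.filter(el =>
        el.type === "application/json" &&
        typeof el.id === "string" &&
        el.id.startsWith(prefix));
}

// Usage with plain objects standing in for script elements.
const picked = pickGeomScripts([
    { id: "geom-159469", type: "application/json", text: "{}" },
    { id: "extension-data", type: "application/json", text: "{}" },
    { id: "", type: "text/javascript", text: "" }
]);
console.log(picked.map(el => el.id)); // [ 'geom-159469' ]
```

Script tags injected by browser extensions would be skipped because they do not carry the prefix.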

tomersagi commented 8 months ago

Replaced the code above so that only a single geom is picked up for the target place, and it is taken from the geom_as_json context variable.

tomersagi commented 8 months ago

OK, replaced the code for the candidate place mapping, but this did something strange: it added a point for some other place, 95134. The console shows only one place being added when processing the hits. There should be two places, and their ids should match the flash marker id.

tomersagi commented 8 months ago

@sinairusinek the problem is mostly solved. Two issues remain:

tomersagi commented 8 months ago

Review screen fixed