osm-fr / osmose-frontend

Part of osmose that shows results on the website http://osmose.openstreetmap.fr
GNU General Public License v3.0
41 stars 39 forks source link

Lock contention #461

Closed anayrat closed 11 months ago

anayrat commented 1 year ago

Hello,

I dig a little bit to understand why we got lock contention between insert .. on conflict ... update.

Here is a quick example :

create table t1 (c1 int primary key);
insert into t1 select generate_series(1,10000);

Then :

-- first session :
begin;
insert into t1 values (1) on conflict (c1) do update set c1=10001 returning *;
  c1
-------
 10001
(1 row)

INSERT 0 1
-- the transaction is not commited. We can expect other insert occurring in the same transaction

Second session :

insert into t1 values (1) on conflict (c1) do update set c1=10001 returning *; -- This insert is stuck because c1 = 1 is locked by first session

insert into t1 values (2) on conflict (c1) do update set c1=10001 returning *; -- Same here because the new c1 = 1001 is also locked

insert into t1 values (2) on conflict (c1) do update set c1=10002 returning *; -- c1 =2 and c1=1002 is not locked, this insert success

I extracted a transaction from the log extract.log :

First, we got a begin:

2023-05-16 11:53:44.049 UTC [2014258-160]: db=osmose_frontend,user=osmose,app=[unknown],client=[local],xid=0 LOG:  duration: 0.023 ms  statement: BEGIN;

Later, the session took a transaction id (xid=323772555) :

2023-05-16 11:53:44.258 UTC [2014258-167]: db=osmose_frontend,user=osmose,app=[unknown],client=[local],xid=323772555 LOG:  duration: 0.831 ms  execute __asyncpg_stmt_3869__:
  INSERT INTO updates
      (source_id, timestamp, remote_url, remote_ip, version, analyser_version)
  VALUES
      ($1, to_timestamp($2), $3, $4, $5, $6)
  ON CONFLICT DO NOTHING
  RETURNING 1

Then we had thousand insert etc

Later, another session tried to insert a record without success:

2023-05-16 11:53:48.681 UTC [2014242-16]: db=osmose_frontend,user=osmose,app=[unknown],client=[local],xid=323772558 LOG:  process 2014242 still waiting for ShareLock on transaction 323772555 after 1000.118 ms
2023-05-16 11:53:48.681 UTC [2014242-17]: db=osmose_frontend,user=osmose,app=[unknown],client=[local],xid=323772558 DETAIL:  Process holding the lock: 2014258. Wait queue: 2014242.
2023-05-16 11:53:48.681 UTC [2014242-18]: db=osmose_frontend,user=osmose,app=[unknown],client=[local],xid=323772558 CONTEXT:  while locking tuple (175,10) in relation "class"
2023-05-16 11:53:48.681 UTC [2014242-19]: db=osmose_frontend,user=osmose,app=[unknown],client=[local],xid=323772558 STATEMENT:  
    INSERT INTO class (class, item, title, level, tags, detail, fix, trap, example, source, resource, timestamp)

This session remained stuck until transaction 323772555 had been commited :

2023-05-16 11:53:49.718 UTC [2014258-16417]: db=osmose_frontend,user=osmose,app=[unknown],client=[local],xid=0 LOG:  duration: 0.416 ms  statement: COMMIT;
2023-05-16 11:53:49.718 UTC [2014242-20]: db=osmose_frontend,user=osmose,app=[unknown],client=[local],xid=323772558 LOG:  process 2014242 acquired ShareLock on transaction 323772555 after 2037.724 ms
2023-05-16 11:53:49.718 UTC [2014242-21]: db=osmose_frontend,user=osmose,app=[unknown],client=[local],xid=323772558 CONTEXT:  while locking tuple (175,10) in relation "class"

Both sessions tried to update the same record :

class.class = $1 AND
class.item = $2 AND

Parameters : $1 = '20', $2 = '8300'

Unfortunately, this record was already locked line 110:

2023-05-16 11:53:44.259 UTC [2014258-175]: db=osmose_frontend,user=osmose,app=[unknown],client=[local],xid=323772555 LOG:  duration: 0.109 ms  execute __asyncpg_stmt_386b__: 
    INSERT INTO class (class, item, title, level, tags, detail, fix, trap, example, source, resource, timestamp)
    VALUES
        ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, to_timestamp($12))
    ON CONFLICT (item, class) DO
    UPDATE SET
            title = $3,
            level = $4,
            tags = $5,
            detail = $6,
            fix = $7,
            trap = $8,
            example = $9,
            source = $10,
            resource = $11,
            timestamp = to_timestamp($12)
    WHERE
        class.class = $1 AND
        class.item = $2 AND
        class.timestamp < to_timestamp($12) AND
        (
            class.title IS DISTINCT FROM $3 OR
            class.level IS DISTINCT FROM $4 OR
            class.tags IS DISTINCT FROM $5::varchar[] OR
            class.detail IS DISTINCT FROM $6 OR
            class.fix IS DISTINCT FROM $7 OR
            class.trap IS DISTINCT FROM $8 OR
            class.example IS DISTINCT FROM $9 OR
            class.source IS DISTINCT FROM $10 OR
            class.resource IS DISTINCT FROM $11
        )

2023-05-16 11:53:44.259 UTC [2014258-176]: db=osmose_frontend,user=osmose,app=[unknown],client=[local],xid=323772555 DETAIL:  parameters: $1 = '20', $2 = '8300', $3 = '{"de": "Maximalhöhe fehlt", "en": "Unmapped max height limit", "es": "límite de altura máxima no mapeado", "fr": "Hauteur limite maximale non cartographié", "it": "max height limit non mappato", "ja": "最大高制限がマッピングされていない", "nl": "Unmapped max hoogte limiet", "pl": "Do zmapowania: ograniczenie wysokości", "uk": "Незамаплене max height limit"}', $4 = '2', $5 = '{merge,highway,fix:picture,fix:survey}', $6 = '{"de": "Der Hinweis stammt aus einer offenen Datenquelle, aber das reicht nicht aus, um die Qualität der Daten zu gewährleisten. Überprüfe den Hinweis vor Integration der Daten. Nehme keine blinden Importe in OSM vor, sondern prüfe die Datenintegration kritisch.\n\nDie Daten stammen aus einer offenen Datenquelle, ohne dass sie zuvor individuell überprüft wurden.\n\nVerkehrszeichen (Maximalhöhe) von Mapillary erkannt, aber kein \n\n- `barrier=*` + `maxheight=*`\n- `highway=*` + `maxheight=*`\n- `amenity=parking` + `maxheight=*` in der Nähe vorhanden.", "en": "It is from an open data source, but that is not enough to ensure the quality\nof the data. Review it before integrating the data. You must not do blind imports\ninto OSM, you must do critical review of data integration.\n\nThis is reported from an open data source, without any prior individual\nverification of this data.\n\nTraffic sign (max height limit) detected by Mapillary, but no nearby tagging of any:\n\n- `barrier=*` + `maxheight=*`\n- `highway=*` + `maxheight=*`\n- `amenity=parking` + `maxheight=*`", "fr": "It is from an open data source, but that is not enough to ensure the quality\nof the data. Review it before integrating the data. You must not do blind imports\ninto OSM, you must do critical review of data integration.\n\nCette information provient d''une source de données ouverte, sans aucune vérification au préalable d''un individu.\n\nTraffic sign (max height limit) detected by Mapillary, but no nearby tagging of any:\n\n- `barrier=*` + `maxheight=*`\n- `highway=*` + `maxheight=*`\n- `amenity=parking` + `maxheight=*`", "it": "Proviene da una fonte di dati liberi, ma ciò non è sufficiente a garantire\nla qualità del dato. Verificalo prima di inserirlo. Non devi effettuare delle\nimportazioni alla cieca su OSM ma verificare scrupolosamente i dati prima di integrarli.\n\nThis is reported from an open data source, without any prior individual\nverification of this data.\n\nSegnale stradale (max height limit) rilevato da Mapillary ma nei dintorni nessuno elemento etichettato con: \n\n- `barrier=*` + `maxheight=*`\n- `highway=*` + `maxheight=*`\n- `amenity=parking` + `maxheight=*`", "ja": "これはオープンデータに由来するものですが、データの品質は十分に検証されていません。取り込む前に十分レビューしてください。やみくもにOSMに取り込んではいけません。データの取り込みは吟味しながら行う必要があります。\n\nこれはオープンデータの情報源に報告されているもので、このデータに関する検証は行われていません。\n\n最大高制限の交通標識がMapillaryにより検出されましたが、以下のいずれかのタグ付けが近隣にありません:\n\n- `barrier=*` + `maxheight=*`\n- `highway=*` + `maxheight=*`\n- `amenity=parking` + `maxheight=*`", "nl": "It is from an open data source, but that is not enough to ensure the quality\nof the data. Review it before integrating the data. You must not do blind imports\ninto OSM, you must do critical review of data integration.\n\nThis is reported from an open data source, without any prior individual\nverification of this data.\n\nVerkeersbord (max hoogte limiet) gedetecteerd door Mapillary, maar geen tagging in de buurt van: \n\n- `barrier=*` + `maxheight=*`\n- `highway=*` + `maxheight=*`\n- `amenity=parking` + `maxheight=*`"}', $7 = '{"de": "Wenn nach Prüfung sichergestellt werden kann, dass es sich um neue Daten handelt\nund sie für OpenStreetMap geeignet sind, dann können die Daten hinzufügt werden.\n\nErgänze das entsprechende Verkehrsschild, wenn die Bildquelle aktuell und die Zeichenerkennung korrekt ist.", "en": "If after review you are sure that it is new data and right for\nOpenStreetMap, then you can add it.\n\nAdd the appropriate highway tagging if the imagery is up-to-date and sign detection is correct.", "es": "If after review you are sure that it is new data and right for\nOpenStreetMap, then you can add it.\n\nAgregue el etiquetado de highway apropiado si las imágenes están actualizadas y la detección de señales es correcta.", "fr": "Si après vérification vous êtes sûr qu''il s''agit d''une nouvelle donnée et que celle-ci est appropriée pour OpenStreetMap,  alors vous pouvez l''ajouter.\n\nAdd the appropriate highway tagging if the imagery is up-to-date and sign detection is correct.", "ja": "レビューした結果、新しいデータでOpenStreetMapに登録すべきデータである場合には、追加してください。\n\n画像が新しく、標識の検出が正しければ、適切な道路のタグ付けを追加してください。", "nl": "Als u na een beoordeling zeker weet dat het nieuwe gegevens zijn en geschikt zijn voor OpenStreetMap, dan kunt u het toevoegen.\n\nAdd the appropriate highway tagging if the imagery is up-to-date and sign detection is correct."}', $8 = '{"de": "Stelle sicher, dass es nicht bereits an einem anderen Ort existiert.", "en": "Be sure that it does not already exist in another place.", "es": "Asegúrese de que no exista ya en otro lugar.", "fr": "Assurez-vous que l''objet n''existe pas ailleurs.", "ja": "付近の別の場所に既に存在していないことを十分確認してください。", "nl": "Zorg ervoor dat het niet al op een andere plaats bestaat.", "pl": "Upewnij się, że nie istnieje już w innym miejscu."}', $9 = NULL, $10 = 'https://github.com/osm-fr/osmose-backend/blob/master/analysers/analyser_merge_traffic_signs.py#L82', $11 = NULL, $12 = '1684236038'

You should rethink processing to avoid insert line by line. It is not efficient.

One idea could be :

FYI, I created these indexes to cover foreign key :

CREATE INDEX CONCURRENTLY idx_items_categorie_id ON items (categorie_id);
CREATE INDEX CONCURRENTLY idx_markers_counts_item_class ON markers_counts (item,class);
CREATE INDEX CONCURRENTLY idx_markers_status_item_class ON markers_status (item,class);

(this is not related to this issue)

frodrigo commented 11 months ago

Use latter upsert in transaction.