digital-guard / preserv-BR

Digital Preservation of Brazilian metadata of donated maps.
http://git.digital-guard.org/preserv-BR
3 stars 1 forks source link

ERROR: column reference "geom" is ambiguous #141

Closed crebollobr closed 1 year ago

crebollobr commented 1 year ago

Não consegui identificar o erro no sql link https://github.com/digital-guard/preserv-BR/tree/main/data/BA/Salvador/_pk0050.01 make parcel pg_db=ingest20

carlos@addressforall:/var/gits/_dg/preserv-BR/data/BA/Salvador/_pk0050.01$ make parcel pg_db=ingest20 
------------------------------------------
------ BR-BA-Salvador  ------
------ Layer tipo parcel_none  ------
-- Incluindo dados do arquivo-3 do package-7600005001301 na base ingest20 --
 Nome-hash do arquivo-3: e8326aac43117a60b0f6294ecc579e7c2bdaee704206a86d03edcff9a5f9db04.zip
 Tabela do layer: pk7600005001301_p3_parcel
 Sub-arquivos do arquivo-3 com o conteúdo alvo: *edf_edificacao_a.dxf*
 Tema dos sub-arquivos: Edificacao_sem_acentos.zip
Run with tmux and sudo! (DANGER: seems not idempotent on psql)
carlos
Above user is root? If not, you have permissions for all paths?
[Press ENTER to continue or Ctrl+C to quit]
psql postgres://postgres@localhost/ingest20 -c "DROP  TABLE IF EXISTS pk7600005001301_p3_parcel CASCADE"
DROP TABLE
Extracting ....
cd /tmp/sandbox/_pk7600005001_001; 7z  x -y /var/www/preserv.addressforall.org/download/e8326aac43117a60b0f6294ecc579e7c2bdaee704206a86d03edcff9a5f9db04.zip "*edf_edificacao_a.dxf*" ; chmod -R a+rwx . > /dev/null

7-Zip [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=pt_BR.UTF-8,Utf16=on,HugeFiles=on,64 bits,4 CPUs DO-Regular (306F2),ASM,AES-NI)

Scanning the drive for archives:
1 file, 226002892 bytes (216 MiB)                     

Extracting archive: /var/www/preserv.addressforall.org/download/e8326aac43117a60b0f6294ecc579e7c2bdaee704206a86d03edcff9a5f9db04.zip
--
Path = /var/www/preserv.addressforall.org/download/e8326aac43117a60b0f6294ecc579e7c2bdaee704206a86d03edcff9a5f9db04.zip
Type = zip
Physical Size = 226002892

Everything is Ok           

Size:       640186537
Compressed: 226002892
Conferindo se SRID 31984 está configurado:
 srid  |                                    proj4text                                     
-------+----------------------------------------------------------------------------------
 31984 | +proj=utm +zone=24 +south +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs 
(1 row)

[Press ENTER to continue or Ctrl+C to quit]
Executando ogr2ogr ...
sudo docker run --rm --network host -v /tmp/sandbox/_pk7600005001_001:/tmp osgeo/gdal ogr2ogr -lco GEOMETRY_NAME=geom -overwrite -f "PostgreSQL" PG:" dbname='ingest20' host='localhost' port='5432' user='postgres' " "/tmp/edf_edificacao_a.dxf"  -nln pk7600005001301_p3_parcel 
psql postgres://postgres@localhost/ingest20 -c "SELECT ingest.any_load('ogr2ogr','/tmp/sandbox/_pk7600005001_001/edf_edificacao_a.dxf','parcel_none','pk7600005001301_p3_parcel','7600005001301','e8326aac43117a60b0f6294ecc579e7c2bdaee704206a86d03edcff9a5f9db04.zip',array['row_number() OVER () as gid', 'ST_MakePolygon(geom) as geom'],5,1)"
ERROR:  column reference "geom" is ambiguous
LINE 18:                  geom -- geom
                          ^
QUERY:  
      WITH
      scan AS (
        SELECT 19 AS file_id,
                CASE
                 WHEN true THEN gid
                 ELSE gid+COALESCE((SELECT MAX(feature_id) FROM ingest.feature_asis WHERE file_id=19),0)
               END AS gid,
        properties,
               CASE
                 WHEN ST_SRID(geom)=0 THEN ST_SetSRID(geom,4326)
                 WHEN true AND ST_SRID(geom)!=4326 AND true THEN ST_Transform(geom,4326)
                 ELSE geom
               END AS geom
        FROM (
            SELECT row_number() OVER () AS gid,  -- feature_id_col
                 to_jsonb(subq) as properties,
                 geom -- geom
            FROM pk7600005001301_p3_parcel , LATERAL (SELECT row_number() OVER () AS gid,ST_MakePolygon(geom) AS geom) subq
          ) t
      ),
      a0 AS (
        SELECT *
        FROM scan
        WHERE ST_IsValid(geom)
      ),
      a AS (
        SELECT *
        FROM a0
        WHERE ST_IsClosed(geom) = TRUE OR GeometryType(geom) IN ('LINESTRING','MULTILINESTRING')
      ),
      mask AS (SELECT ingest.buffer_geom(geom,1) AS geom FROM ingest.vw01full_jurisdiction_geom WHERE isolabel_ext=(SELECT isolabel_ext FROM ingest.vw01full_packfilevers WHERE id=7600005001301)),
      b AS (
        SELECT file_id, gid, properties, geom, ( B'000000000' ||  (NOT(ST_IsSimple(geom)))::int::bit || (NOT(ST_IsValid(geom)))::int::bit || (NOT(ST_Intersects(geom,(SELECT geom FROM mask))))::int::bit ) AS error_mask
        FROM a
      ),
      c AS (
        (
        SELECT file_id, gid, properties, geom, (error_mask | (
            B'00' ||
            (CASE (SELECT (ingest.donated_PackComponent_geomtype(19))[1]) WHEN 'poly' THEN ST_Area(geom,true) > 2147483647 WHEN 'line' THEN ST_Length(geom,true) > 2147483647 ELSE FALSE END)::int::bit ||
            B'00' ||
            (GeometryType(geom) NOT IN ('POLYGON'   ,'MULTIPOLYGON'))::int::bit ||
            (geom IS NULL)::int::bit ||
            (CASE (SELECT (ingest.donated_PackComponent_geomtype(19))[1]) WHEN 'poly' THEN ST_Area(geom,true) < 5 WHEN 'line' THEN ST_Length(geom,true) < 2 ELSE FALSE END)::int::bit ||
            (ST_IsEmpty(geom))::int::bit ||
            B'000' )) AS error_mask
        FROM (
           SELECT file_id, gid,
                  properties,
                  CASE (SELECT (ingest.donated_PackComponent_geomtype(19))[1])
                    WHEN 'point' THEN ST_ReducePrecision( ST_Intersection(geom,(SELECT geom FROM mask)), 0.000001 )
                    ELSE ST_SimplifyPreserveTopology( -- remove collinear points 
                            ST_ReducePrecision( -- round decimal degrees of SRID 4326, ~1 meter
                              ST_Intersection( geom, (SELECT geom FROM mask) )
                              ,0.000001
                            ),
                            0.00000001
                    )
                  END AS geom,
                  error_mask
           FROM b
           WHERE bit_count(error_mask) = 0 
           ) t
        )
        UNION
        (
            SELECT * FROM b WHERE bit_count(error_mask) <> 0
        )
        UNION
        (
            SELECT file_id, gid, properties, ST_MakeValid(geom) AS geom, B'000100000000' AS error_mask
            FROM a0
            WHERE ST_IsClosed(geom) = FALSE AND GeometryType(geom) NOT IN ('LINESTRING','MULTILINESTRING')
        )
        UNION
        (
            SELECT file_id, gid, properties, ST_MakeValid(geom) AS geom, B'000000000010' AS error_mask
            FROM scan
            WHERE ST_IsValid(geom) = FALSE
        )
      ),
      stats AS (
      SELECT ARRAY [
            (SELECT COUNT(*) FROM scan)::bigint,
            (COUNT(*) filter (WHERE get_bit(error_mask,11) = 1))::bigint, -- intersects
            (COUNT(*) filter (WHERE get_bit(error_mask,10) = 1))::bigint, -- invalid
            (COUNT(*) filter (WHERE get_bit(error_mask, 9) = 1))::bigint, -- simple
            (COUNT(*) filter (WHERE get_bit(error_mask, 8) = 1))::bigint, -- empty
            (COUNT(*) filter (WHERE get_bit(error_mask, 7) = 1))::bigint, -- small
            (COUNT(*) filter (WHERE get_bit(error_mask, 6) = 1))::bigint, -- null
            (COUNT(*) filter (WHERE get_bit(error_mask, 5) = 1))::bigint, -- invalid_type
                                                                          -- bit 4 é reservado para duplicados
            (COUNT(*) filter (WHERE get_bit(error_mask, 3) = 1))::bigint, -- is_closed
            (COUNT(*) filter (WHERE get_bit(error_mask, 2) = 1))::bigint  -- large
                                                                          -- bit 1 uso futuro
                                                                          -- bit 0 uso futuro
            ]
        FROM c
      ),
      ins_asis AS (
        INSERT INTO ingest.feature_asis
        SELECT file_id, gid, properties, geom
        FROM c
            WHERE  bit_count(error_mask) = 0
        RETURNING 1
      ),
      ins_asis_discarded AS (
        INSERT INTO ingest.feature_asis_discarded
        SELECT file_id, gid, properties || jsonb_build_object('error_mask',error_mask), geom
        FROM c
            WHERE  bit_count(error_mask) <> 0
        ON CONFLICT (file_id, feature_id)
        DO UPDATE
        SET properties = EXCLUDED.properties, geom = EXCLUDED.geom
        RETURNING 1
      )
      SELECT array_append(array_append( (SELECT * FROM stats), (SELECT COUNT(*) FROM ins_asis) ), (SELECT COUNT(*) FROM ins_asis_discarded))

CONTEXT:  PL/pgSQL function ingest.any_load(text,text,text,text,bigint,text,text[],integer,integer,boolean,text,boolean,text,text) line 221 at EXECUTE
make: *** [makefile:155: parcel] Error 1
0e1 commented 1 year ago

Situação excelente pra exemplificar a diferença entre sql_view e sql_select.

Simplificadamente, é isso que acontece com um arquivo:

  1. Seja file1 um arquivo;
  2. file1 é convertido para tabname;

A partir daí, dois caminhos são possíveis:

  1. Ao usar sql_view: 3.1. CREATE wv_file1 AS sql_view FROM tabname; 3.2. any_load é aplicada em wv_file1;

ou

  1. Ao usar sql_select: 3.1 any_load é aplicada em tabname LATERAL sql_select;

Por isso, se usar sql_select o conteúdo do array não pode criar novos nomes que conflitem com as colunas já existentes de tabname (geom já existe em tabname então, ST_MakePolygon(geom) AS geom cria outra igual. Daí a ambiguidade). Ainda, sempre é bom lembrar que é sql_select ou (exclusivo) sql_view.

Então, no caso em questão, você deve usar sql_view:

sql_view: SELECT row_number() OVER () AS gid, ST_MakePolygon(geom) AS geom FROM $(tabname)

Mas, o arquivo em questão é um .dxf. No meu teste o programa ogr2ogr carrega o arquivo de tal maneira que ST_SetSRID(geom)=0. Isso evidencia que o ogr2ogr não reconhece o SRID em todos os casos, como foi suposto na implementação do método ogr2ogr. Essa suposição implicou em não usar os valores de SRID oriundos de codec:descr_encode e de codec na chamada do programa ogr2ogr.

Então, três possibilidades para os casos onde o programa ogr2ogr não reconhece o SRID:

  1. sql_view: SELECT row_number() OVER () AS gid, ST_SetSRID(ST_MakePolygon(geom),31984) AS geom FROM $(tabname)
  2. sql_view: SELECT row_number() OVER () AS gid, ST_MakePolygon(geom) AS geom FROM $(tabname)
    method_opts: -a_srs EPSG:31984

    Ver doc de a_srs em https://gdal.org/programs/ogr2ogr.html#cmdoption-ogr2ogr-a_srs.

  3. Usar a_srs em todos os casos, consumindo os valores de codec:descr_encode e de codec

    sql_view: SELECT row_number() OVER () AS gid, ST_MakePolygon(geom) AS geom FROM $(tabname)
0e1 commented 1 year ago

Notar que implementei a opção número 3, citada acima.

Então, você deve usar, apenas:

sql_view: SELECT row_number() OVER () AS gid, ST_MakePolygon(geom) AS geom FROM $(tabname)