digital-guard / preserv

Digital Preservation Project
http://git.digital-guard.org/preserv
Apache License 2.0

Bug in the makefile generator when used with an explicitly specified ingestion pg_db #50

Closed: ppKrauss closed this issue 2 years ago

ppKrauss commented 2 years ago

dump:

peter@oficalNews2018:/var/gits/_dg/preserv-BR/data/SP/Santos/_pk0029.01$ make insert_size pg_db=ingest1
-- Carrega make_conf.yaml na base de dados. --
Uso: make insert_make_conf
pack_id: 29.1
[ENTER para continuar ou ^C para sair]

psql postgres://postgres@localhost/ingest1 -c "SELECT ingest.lix_insert('/var/gits/_dg/preserv-BR/data/SP/Santos/_pk0029.01/make_conf.yaml');"
NOTICE:  ext orig_filename_ext : make_conf.yaml
NOTICE:  ext orig_filename_ext : BR
ERROR:  yaml.parser.ParserError: while parsing a block mapping
  in "<unicode string>", line 1, column 1:
    pack_id: 29.1
    ^
expected <block end>, but found '-'
  in "<unicode string>", line 7, column 1:
    - file: b192fba419ef8133861a9051 ... 
    ^
CONTEXT:  Traceback (most recent call last):
  PL/Python function "yaml_to_jsonb", line 4, in <module>
    return json.dumps( yaml.safe_load(p_yaml) )
  PL/Python function "yaml_to_jsonb", line 161, in safe_load
  PL/Python function "yaml_to_jsonb", line 113, in load
  PL/Python function "yaml_to_jsonb", line 48, in get_single_data
  PL/Python function "yaml_to_jsonb", line 35, in get_single_node
  PL/Python function "yaml_to_jsonb", line 54, in compose_document
  PL/Python function "yaml_to_jsonb", line 83, in compose_node
  PL/Python function "yaml_to_jsonb", line 126, in compose_mapping_node
  PL/Python function "yaml_to_jsonb", line 97, in check_event
  PL/Python function "yaml_to_jsonb", line 437, in parse_block_mapping_key
PL/Python function "yaml_to_jsonb"
PL/pgSQL function ingest.lix_insert(text) line 24 at assignment
make: *** [makefile:91: insert_make_conf] Error 1

0e1 commented 2 years ago

Bug fixed. The size insertion was dropping the files: key, which made the generated YAML invalid.
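
The failure mode can be illustrated with a small sketch. It assumes the broken file looked like a top-level mapping (pack_id: ...) followed directly by a column-0 list item with no parent key, which is what the parser error "expected <block end>, but found '-'" suggests. The helper name find_orphan_sequence and the sample contents are hypothetical and not part of preserv. It is a rough stdlib-only heuristic, not a YAML parser (for example, it does not handle a comment after a key).

```python
import re

# Hypothetical helper, not part of the preserv code base.
# Matches a top-level "key:" at column 0.
_TOP_KEY = re.compile(r"^[A-Za-z_][\w-]*\s*:")


def find_orphan_sequence(yaml_text):
    """Return the 1-based line number of the first column-0 '- ' item that
    follows a top-level 'key: value' line (a key that already has a scalar
    value), or None if no such item is found."""
    seen_mapping = False       # a top-level key has been seen
    last_key_is_open = False   # the last top-level key has no inline value
    for lineno, line in enumerate(yaml_text.splitlines(), start=1):
        if not line.strip() or line.lstrip().startswith("#"):
            continue           # blank line or comment
        if line[0] in " \t":
            continue           # nested content, ignored by this heuristic
        if line == "-" or line.startswith("- "):
            if seen_mapping and not last_key_is_open:
                return lineno  # list item with no parent key
            continue
        match = _TOP_KEY.match(line)
        if match:
            seen_mapping = True
            last_key_is_open = line[match.end():].strip() == ""
    return None


# Made-up samples that only mimic the shape seen in the logs.
broken = "pack_id: 29.1\n- file: example.zip\n  p: 1\n"
fixed = "pack_id: 29.1\nfiles:\n- file: example.zip\n  p: 1\n  size: 10\n"

print(find_orphan_sequence(broken))  # 2
print(find_orphan_sequence(fixed))   # None
```

A check like this could run on the generated file before it is moved over make_conf.yaml, although parsing it with a real YAML library is the more reliable test.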

Example of size insertion and ingestion:

claiton@oficalNews2018:/var/gits/_dg/preserv-BR/data/SP/Santos/_pk0029.01$ make insert_size 
-- Carrega make_conf.yaml na base de dados. --
Uso: make insert_make_conf
pack_id: 29.1
[ENTER para continuar ou ^C para sair]

psql postgres://postgres@localhost/ingest1 -c "SELECT ingest.lix_insert('/var/gits/_dg/preserv-BR/data/SP/Santos/_pk0029.01/make_conf.yaml');"
NOTICE:  ext orig_filename_ext : make_conf.yaml
NOTICE:  ext orig_filename_ext : BR
 lix_insert 
------------

(1 row)

-- Updating make_conf with files size --
psql postgres://postgres@localhost/ingest1 -c "SELECT ingest.lix_generate_make_conf_with_size('BR','29.1');"
 lix_generate_make_conf_with_size 
----------------------------------
 Ok. Content bytes writed:710    +
 See /tmp/pg_io/make_conf_BR29.1
(1 row)

sudo chmod 777 /tmp/pg_io/make_conf_BR29.1
[sudo] password for claiton: 
 Check diff, the '<' lines are the new ones... Something changed?
9,12c9,12
< - file: b192fba419ef8133861a9051d2382d08476193eafbd8932f0ea05456157c301c.zip
<   name: Pontos de endereço
<   p: 1
<   size: 1922893
---
>   -
>     p:    1
>     file: b192fba419ef8133861a9051d2382d08476193eafbd8932f0ea05456157c301c.zip
>     name: Pontos de endereço
If some changes, and no error in the changes, move the script:
 mv /tmp/pg_io/make_conf_BR29.1 ./make_conf.yaml
[ENTER para rodar mv ou ^C para sair]

mv /tmp/pg_io/make_conf_BR29.1 ./make_conf.yaml
claiton@oficalNews2018:/var/gits/_dg/preserv-BR/data/SP/Santos/_pk0029.01$ make me
-- Carrega make_conf.yaml na base de dados. --
Uso: make insert_make_conf
pack_id: 29.1
[ENTER para continuar ou ^C para sair]

psql postgres://postgres@localhost/ingest1 -c "SELECT ingest.lix_insert('/var/gits/_dg/preserv-BR/data/SP/Santos/_pk0029.01/make_conf.yaml');"
NOTICE:  ext orig_filename_ext : make_conf.yaml
NOTICE:  ext orig_filename_ext : BR
 lix_insert 
------------

(1 row)

-- Updating this make --
psql postgres://postgres@localhost/ingest1 -c "SELECT ingest.lix_generate_makefile('BR','29.1');"
NOTICE:  value of codec_desc_global : {"srid": "31983"}
NOTICE:  layer : geoaddress
NOTICE:  3. codec_desc_default : {"srid": "4326", "charset": "ISO-8859-1"}
NOTICE:  codec resultante : {"srid": "31983", "charset": "ISO-8859-1"}
NOTICE:  codec_extension : shp
     lix_generate_makefile     
-------------------------------
 Ok. Content bytes writed:8221+
 See /tmp/pg_io/makeme_BR29.1
(1 row)

sudo chmod 777 /tmp/pg_io/makeme_BR29.1
 Check diff, the '<' lines are the new ones... Something changed?
1,99d0
< ##
< ## Template file reference: preserv-BR/data/RS/PortoAlegre/_pk027
< ## tplId: 027a
< ##
< tplInputSchema_id=027a
< 
< ## BASIC CONFIG
< srid   =31983
< pg_io  =/tmp/pg_io
< orig   =/var/www/preserv.addressforall.org/download
< pg_uri =postgres://postgres@localhost
< pg_db  =ingest1
< sandbox_root=/tmp/sandbox
< sandbox=$(sandbox_root)/_pkBR291_001
< need_commands= 7z v16+; psql v12+; shp2pgsql v3+; 
< 
< ## COMPOSED VARS
< pg_uri_db   =$(pg_uri)/$(pg_db)
< 
< 
< all:
<   @echo "=== Resumo deste makefile de recuperação de dados preservados ==="
<   @printf "Targets para a geração de layers:\n\tall_layers geoaddress \n"
<   @printf "Demais targets implementados:\n\tmakedirs clean clean_sandbox wget_files me readme delete_file\n"
<   @echo "A geração de layers requer os seguintes comandos e versões:\n\t$(need_commands)"
< 
< all_layers: geoaddress 
<   @echo "--ALL LAYERS--"
< 
< ## ## ## ## ## ## ## ## ##
< ## Make targets of the Project Digital Preservation
< ## Sponsored by Project AddressForAll
< 
< 
< 
< 
< 
< 
< geoaddress: tabname = pk7600002901101_p1_geoaddress
< geoaddress: makedirs $(orig)/b192fba419ef8133861a9051d2382d08476193eafbd8932f0ea05456157c301c.zip
<   @# pk291_p1 - ETL extrating to PostgreSQL/PostGIS the geoaddress datatype
<   @echo
<   @echo "------------------------------------------"
<   @echo "------ Layer tipo geoaddress_full  ------"
<   @echo "-- Incluindo dados do arquivo-1 do package-7600002901101 na base $(pg_db) --"
<   @echo " Nome-hash do arquivo-1: b192fba419ef8133861a9051d2382d08476193eafbd8932f0ea05456157c301c.zip"
<   @echo " Tabela do layer: pk7600002901101_p1_geoaddress"
<   @echo " Sub-arquivos do arquivo-1 com o conteúdo alvo: *LOTES_PONTO_S2K*"
<   @echo " Tema dos sub-arquivos: Pontos de endereço"
<   @echo "Run with tmux and sudo! (DANGER: seems not idempotent on psql)"
<   @whoami
<   @printf "Above user is root? If not, you have permissions for all paths?\n [press ENTER for yes else ^C]"
<   @read _press_enter_
<   psql $(pg_uri_db) -c "DROP TABLE IF EXISTS pk7600002901101_p1_geoaddress CASCADE"
<   @tput bold
<   @echo Extracting ....
<   @tput sgr0
<   cd $(sandbox); 7z  x -y /var/www/preserv.addressforall.org/download/b192fba419ef8133861a9051d2382d08476193eafbd8932f0ea05456157c301c.zip "*LOTES_PONTO_S2K*" ; chmod -R a+rx . > /dev/null
<   @echo "Conferindo se SRID 31983 esta configurado:"
<   @psql $(pg_uri_db) -c "SELECT srid, proj4text FROM spatial_ref_sys where srid=31983"
<   @echo "Tudo bem até aqui?  [ENTER para continuar ou ^C para rodar WS/ingest-step1]"
<   @read _tudo_bem_
<   @echo Executando shp2pgsql ...
<   cd $(sandbox); shp2pgsql -D -W ISO-8859-1  -s 31983 "LOTES_PONTO_S2K.shp" pk7600002901101_p1_geoaddress | psql -q $(pg_uri_db) 2> /dev/null
<   psql $(pg_uri_db) -c "SELECT ingest.any_load('shp2sql','$(sandbox)/LOTES_PONTO_S2K.shp','geoaddress_full','pk7600002901101_p1_geoaddress','7600002901101','b192fba419ef8133861a9051d2382d08476193eafbd8932f0ea05456157c301c.zip',array['gid', 'L_NUMERO AS pointer_number', 'X', 'Y', 'geom'])"
<   @echo "Confira os resultados nas tabelas ingest.donated_packcomponent e ingest.feature_asis".
<   @echo FIM.
< 
< geoaddress-clean:
<   rm -f "$(sandbox)/*LOTES_PONTO_S2K.*" || true
<   psql $(pg_uri_db) -c "DROP TABLE IF EXISTS pk7600002901101_p1_geoaddress CASCADE"
< 
< 
< 
< 
< 
< 
< 
< ## ## ## ## ## ## ## ## ##
< 
< makedirs: clean_sandbox
<   @mkdir -m 777 -p $(sandbox_root)
<   @mkdir -m 777 -p $(sandbox)
<   @mkdir -p $(pg_io)
< 
< wget_files:
<   @echo "Under construction, need to check that orig path is not /var/www! or use orig=x [ENTER if not else ^C]"
<   @echo $(orig)
<   @read _ENTER_OK_
<   mkdir -p $(orig)
<   @cd $(orig); wget http://preserv.addressforall.org/download/b192fba419ef8133861a9051d2382d08476193eafbd8932f0ea05456157c301c.zip && chmod o+rw b192fba419ef8133861a9051d2382d08476193eafbd8932f0ea05456157c301c.zip
<   @echo "Please, if orig not default, run 'make _target_ orig=$(orig)'"
< 
< ## ## ## ## ## ## ## ## ##
< 
< clean_sandbox:
<   @rm -rf $(sandbox) || true
< 
< clean: geoaddress-clean 
If some changes, and no error in the changes, move the script:
 mv /tmp/pg_io/makeme_BR29.1 ./makefile
[ENTER para rodar mv ou ^C para sair]

mv /tmp/pg_io/makeme_BR29.1 ./makefile
claiton@oficalNews2018:/var/gits/_dg/preserv-BR/data/SP/Santos/_pk0029.01$ make all_layers 

------------------------------------------
------ Layer tipo geoaddress_full  ------
-- Incluindo dados do arquivo-1 do package-7600002901101 na base ingest1 --
 Nome-hash do arquivo-1: b192fba419ef8133861a9051d2382d08476193eafbd8932f0ea05456157c301c.zip
 Tabela do layer: pk7600002901101_p1_geoaddress
 Sub-arquivos do arquivo-1 com o conteúdo alvo: *LOTES_PONTO_S2K*
 Tema dos sub-arquivos: Pontos de endereço
Run with tmux and sudo! (DANGER: seems not idempotent on psql)
claiton
Above user is root? If not, you have permissions for all paths?
 [press ENTER for yes else ^C]
psql postgres://postgres@localhost/ingest1 -c "DROP TABLE IF EXISTS pk7600002901101_p1_geoaddress CASCADE"
NOTICE:  table "pk7600002901101_p1_geoaddress" does not exist, skipping
DROP TABLE
Extracting ....
cd /tmp/sandbox/_pkBR291_001; 7z  x -y /var/www/preserv.addressforall.org/download/b192fba419ef8133861a9051d2382d08476193eafbd8932f0ea05456157c301c.zip "*LOTES_PONTO_S2K*" ; chmod -R a+rx . > /dev/null

7-Zip [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=C.UTF-8,Utf16=on,HugeFiles=on,64 bits,4 CPUs Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz (406F1),ASM,AES-NI)

Scanning the drive for archives:
1 file, 1922893 bytes (1878 KiB)                      

Extracting archive: /var/www/preserv.addressforall.org/download/b192fba419ef8133861a9051d2382d08476193eafbd8932f0ea05456157c301c.zip
--
Path = /var/www/preserv.addressforall.org/download/b192fba419ef8133861a9051d2382d08476193eafbd8932f0ea05456157c301c.zip
Type = zip
Physical Size = 1922893

Everything is Ok

Files: 5
Size:       15578172
Compressed: 1922893
Conferindo se SRID 31983 esta configurado:
 srid  |                                    proj4text                                     
-------+----------------------------------------------------------------------------------
 31983 | +proj=utm +zone=23 +south +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs 
(1 row)

Tudo bem até aqui?  [ENTER para continuar ou ^C para rodar WS/ingest-step1]

Executando shp2pgsql ...
cd /tmp/sandbox/_pkBR291_001; shp2pgsql -D -W ISO-8859-1  -s 31983 "LOTES_PONTO_S2K.shp" pk7600002901101_p1_geoaddress | psql -q postgres://postgres@localhost/ingest1 2> /dev/null
Field l_numero is an FTDouble with width 10 and precision 0
Field x is an FTDouble with width 9 and precision 5
Field y is an FTDouble with width 9 and precision 5
Shapefile type: Point
Postgis type: POINT[2]
                            addgeometrycolumn                            
-------------------------------------------------------------------------
 public.pk7600002901101_p1_geoaddress.geom SRID:31983 TYPE:POINT DIMS:2 
(1 row)

psql postgres://postgres@localhost/ingest1 -c "SELECT ingest.any_load('shp2sql','/tmp/sandbox/_pkBR291_001/LOTES_PONTO_S2K.shp','geoaddress_full','pk7600002901101_p1_geoaddress','7600002901101','b192fba419ef8133861a9051d2382d08476193eafbd8932f0ea05456157c301c.zip',array['gid', 'L_NUMERO AS pointer_number', 'X', 'Y', 'geom'])"
                   any_load                   
----------------------------------------------
 From file_id=1 inserted type=geoaddress_full+
 in feature_asis 43389 items.
(1 row)

Confira os resultados nas tabelas ingest.donated_packcomponent e ingest.feature_asis.
FIM.
--ALL LAYERS--
claiton@oficalNews2018:/var/gits/_dg/preserv-BR/data/SP/Santos/_pk0029.01$
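
The codec NOTICE lines printed by lix_generate_makefile above are consistent with a key-by-key override, where the global codec description replaces matching keys of the layer default. The sketch below only reproduces that observed result. The function name merge_codec is hypothetical, and the actual logic inside the ingest schema may differ.

```python
def merge_codec(default, override):
    """Return a new dict: default values first, then override wins per key."""
    merged = dict(default)   # copy so the inputs are not mutated
    merged.update(override)  # keys present in override replace the defaults
    return merged


# Values copied from the NOTICE lines in the log above.
codec_desc_default = {"srid": "4326", "charset": "ISO-8859-1"}
codec_desc_global = {"srid": "31983"}

print(merge_codec(codec_desc_default, codec_desc_global))
# {'srid': '31983', 'charset': 'ISO-8859-1'}
```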