gbif / checklistbank

GBIF Checklist Bank
Apache License 2.0

clb importer mybatis exception #38

Closed mdoering closed 7 years ago

mdoering commented 7 years ago
java.lang.RuntimeException: java.util.concurrent.ExecutionException: org.apache.ibatis.exceptions.PersistenceException: 
### Error querying database.  Cause: org.postgresql.util.PSQLException: ERROR: invalid byte sequence for encoding "UTF8": 0x00
### The error may exist in org/gbif/checklistbank/service/mybatis/mapper/ParsedNameMapper.xml
### The error may involve org.gbif.checklistbank.service.mybatis.mapper.ParsedNameMapper.getByName-Inline
### The error occurred while setting parameters
### SQL: SELECT          n.id, n.scientific_name, n.canonical_name, n.type,     n.genus_or_above, n.infra_generic, n.specific_epithet, n.infra_specific_epithet, n.cultivar_epithet,     n.notho_type, n.rank, n.nom_status, n.sensu, n.remarks,     n.authors_parsed, n.parsed, n.authorship, n.year, n.bracket_authorship, n.bracket_year             FROM               name n             WHERE n.scientific_name=? AND n.rank=?::rank
### Cause: org.postgresql.util.PSQLException: ERROR: invalid byte sequence for encoding "UTF8": 0x00
at com.google.common.base.Throwables.propagate(Throwables.java:160)
at org.gbif.checklistbank.cli.importer.Importer.run(Importer.java:142)

mdoering commented 7 years ago

This happened again when importing the new IRMNG. Is this dataset the first time we have encountered the null character? We would need to replace the null character with a regular space, or remove it entirely, in Java before the value reaches the query:

myValue = myValue.replace("\u0000", "");
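
As a rough sketch of that cleanup step, the value could be sanitized before it is bound as a query parameter. The class `NullCharCleaner` and its method `stripNulls` are hypothetical names used for illustration only; this is not the code from the actual fix.

```java
// Hypothetical helper, not part of checklistbank: removes U+0000 from strings
// so PostgreSQL does not reject them with "invalid byte sequence ... 0x00".
final class NullCharCleaner {
  private NullCharCleaner() {}

  /** Returns the input with every U+0000 character removed; null stays null. */
  static String stripNulls(String value) {
    // Fast path: nothing to do for null input or strings without a null character.
    if (value == null || value.indexOf('\u0000') < 0) {
      return value;
    }
    // String.replace does a literal (non-regex) replacement of every occurrence.
    return value.replace("\u0000", "");
  }
}
```

Calling `NullCharCleaner.stripNulls(scientificName)` before the lookup would leave clean names untouched and drop the offending character from the rest. Replacing it with a space instead is the other option mentioned above; which one is preferable depends on where the null characters appear in the data.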

mdoering commented 7 years ago

IRMNG contains a null char in this line: 1144985 urn:lsid:irmng.org:taxname:1144985 1144985 100871 Paratiberioides Paratiberioides Passalidae Elytra 22 (2), Nov 15 Animalia Arthropoda Insecta Coleoptera Passalidae Paratiberioides Genus Iwase, 1994 ICZN accepted 2012-01-01 IRMNG (2012). Paratiberioides Iwase?, 1994. Accessed through: The Interim Register of Marine and Nonmarine Genera at http://www.irmng.org/aphia.php?p=taxdetails&id=1144985 http://www.irmng.org/aphia.php?p=taxdetails&id=1144985
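
To locate records like the one above in a source file, a small scan along these lines might help. It is only a sketch: the class name `NullCharScanner` is made up for illustration, and it assumes the file is valid UTF-8 text (malformed input would raise an `IOException`).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical utility, not part of checklistbank: reports which lines of a
// text file contain the null character U+0000.
final class NullCharScanner {
  private NullCharScanner() {}

  /** Returns the 1-based numbers of all lines that contain U+0000. */
  static List<Integer> linesWithNull(Path file) throws IOException {
    List<Integer> hits = new ArrayList<>();
    try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
      String line;
      int lineNumber = 0;
      while ((line = reader.readLine()) != null) {
        lineNumber++;
        if (line.indexOf('\u0000') >= 0) {
          hits.add(lineNumber);
        }
      }
    }
    return hits;
  }
}
```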

mdoering commented 7 years ago

fixed in https://github.com/gbif/checklistbank/commit/945be6812df06990edf28261d0082cc78d109ddb