dchaplinsky / german_registry_parser

Will the fun ever stop

Officer text appearing in `company` and `address` tags #9

Open skenaja opened 6 years ago

skenaja commented 6 years ago

https://github.com/dchaplinsky/german_registry_parser/blob/4c7a2801212e0ef19c41f57d9a82dcd2fac828ad/parsing_results/sample2/120052.json#L36-L40

dchaplinsky commented 6 years ago

Yup, the problem here is the absence of a separator between the address and Geschäftsführer. I'll try to fix such corner cases when improving parsing in general.

skenaja commented 6 years ago

I think f5c1739a826889959c529d5517b704a0eda8b292 partially fixed these.

Check out notice be-108209, where the parser treats the company partner as a person and the output looks like this:

"officers": [
  {
    "class": "PersonalPartner",
    "payload": {
      "city": "HRB 121788 B)",
      "lastname": "Departmentstore Quartier 206 Verwaltungs-GmbH",
      "name": "Berlin (Amtsgericht Charlottenburg",
      "ref": 1
    },
    "text": "1) Departmentstore Quartier 206 Verwaltungs-GmbH, Berlin (Amtsgericht Charlottenburg, HRB 121788 B)"
  }
]
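
One possible way to catch this case is to check for a register reference of the form `(Amtsgericht <court>, HRB <number>)` before falling back to the person parser. The snippet below is only a rough sketch under that assumption: the function name `parse_company_partner`, the `CompanyPartner` class label, and the payload keys are made up for illustration and are not taken from the repository's code.

```python
import re

# Hypothetical pattern, not the parser's actual grammar:
# "[<ref>) ]<name>, <city> (Amtsgericht <court>, <HRA|HRB> <number> [<suffix>])"
COMPANY_PARTNER_RE = re.compile(
    r"^(?:(?P<ref>\d+)\)\s*)?"            # optional leading reference, e.g. "1) "
    r"(?P<name>.+?),\s*"                  # company name up to a comma
    r"(?P<city>[^,(]+?)\s*"               # seat of the company
    r"\(Amtsgericht\s+(?P<court>[^,]+),\s*"
    r"(?P<register>HR[AB]\s*\d+(?:\s*[A-Z]{1,2})?)\)\s*$"
)


def parse_company_partner(text):
    """Return a company-partner record, or None if no register reference is found."""
    match = COMPANY_PARTNER_RE.match(text.strip())
    if match is None:
        return None  # caller would fall back to the person parser here
    payload = {
        "name": match.group("name").strip(),
        "city": match.group("city").strip(),
        "court": match.group("court").strip(),
        "register": match.group("register").strip(),
    }
    if match.group("ref") is not None:
        payload["ref"] = int(match.group("ref"))
    # "CompanyPartner" is an illustrative label, not an existing class in the repo.
    return {"class": "CompanyPartner", "payload": payload, "text": text}


if __name__ == "__main__":
    sample = (
        "1) Departmentstore Quartier 206 Verwaltungs-GmbH, "
        "Berlin (Amtsgericht Charlottenburg, HRB 121788 B)"
    )
    print(parse_company_partner(sample))
```

Running the register check first means a person entry such as "Surname, Forename, City" simply does not match and can go through the existing person logic unchanged.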