glourencoffee / pycvm

Python library for processing data from CVM
MIT License

Reading FCA results in invalid countries #12

Closed glourencoffee closed 2 years ago

glourencoffee commented 2 years ago

Description

Reading FCA files produces messages reporting that some country values are invalid.

Steps to reproduce

  1. Download the FCA of 2017
  2. Open it with cvm.fca_reader()
  3. Iterate through it until exhaustion

Expected behavior

Reading countries from FCA documents shouldn't result in any failure messages. Country fields should thus be optional. If a country name is valid, a Country object corresponding to that country should be created. Otherwise, a warning message should be printed to report that the country name is unknown, and None should be used instead.

The library could also recognize typos, such as "Brasi" and "Espanhã", and map them to their corresponding Country.
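
A minimal sketch of the lenient parsing described above, not the library's actual implementation. The `Country` enum here is a hypothetical stand-in with only four members, and `ALIASES`, `PLACEHOLDERS`, and `parse_country` are names made up for illustration. Mapping 'Inglaterra' to the United Kingdom and the 0.8 similarity cutoff are assumptions as well.

```python
import difflib
import logging
import unicodedata
from enum import Enum
from typing import Optional

logger = logging.getLogger(__name__)


class Country(Enum):
    # Hypothetical stand-in; the library's real Country type may differ.
    BR = "Brasil"
    ES = "Espanha"
    US = "Estados Unidos"
    GB = "Reino Unido"


# Assumed aliases for values seen in the FCA of 2017 (keys are normalized).
ALIASES = {"br": Country.BR, "eua": Country.US, "inglaterra": Country.GB}

# Assumed "no country" markers (normalized): treated as a missing value.
PLACEHOLDERS = {"", "n/a", "nao aplicavel"}


def _normalize(text: str) -> str:
    """Strip whitespace and diacritics, then casefold."""
    decomposed = unicodedata.normalize("NFKD", text.strip())
    without_marks = "".join(c for c in decomposed if not unicodedata.combining(c))
    return without_marks.casefold()


_BY_NAME = {_normalize(c.value): c for c in Country}


def parse_country(raw: str) -> Optional[Country]:
    """Return the matching Country, or None if the value is missing or unknown."""
    key = _normalize(raw)

    # Placeholders and punctuation-only values such as '...............'.
    if key in PLACEHOLDERS or not any(ch.isalnum() for ch in key):
        return None

    if key in _BY_NAME:
        return _BY_NAME[key]
    if key in ALIASES:
        return ALIASES[key]

    # Tolerate small typos such as 'Brasi'.
    close = difflib.get_close_matches(key, list(_BY_NAME), n=1, cutoff=0.8)
    if close:
        return _BY_NAME[close[0]]

    logger.warning("Unknown country name: %r", raw)
    return None
```

In this sketch, placeholder values return None silently and only unrecognized names trigger a warning. Whether placeholders should also warn is a design choice left open here.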

Actual behavior

Reading the FCA of 2017 results in the following messages related to invalid countries, some of which are errors:

Skipping line 2 in trading admissions batch 65209: failed to create object from value 'Inglaterra' at field 'Pais': 'Inglaterra' is not a valid Country
Skipping line 1 in addresses batch 62885: failed to create object from value 'Espanhã' at field 'Pais': 'Espanhã' is not a valid Country
Skipping line 2 in addresses batch 62885: failed to create object from value 'Espanhã' at field 'Pais': 'Espanhã' is not a valid Country
Error while reading issuer company: failed to create object from value 'BR' at field 'Pais_Custodia_Valores_Mobiliarios': 'BR' is not a valid Country
Error while reading issuer company: failed to create object from value '...............' at field 'Pais_Custodia_Valores_Mobiliarios': '...............' is not a valid Country
Error while reading issuer company: failed to create object from value 'Não aplicável' at field 'Pais_Custodia_Valores_Mobiliarios': 'Não aplicável' is not a valid Country
Skipping line 1 in addresses batch 63634: failed to create object from value 'Brasi' at field 'Pais': 'Brasi' is not a valid Country
Error while reading issuer company: failed to create object from value 'N/A' at field 'Pais_Custodia_Valores_Mobiliarios': 'N/A' is not a valid Country
Error while reading issuer company: failed to create object from value 'Não aplicável' at field 'Pais_Custodia_Valores_Mobiliarios': 'Não aplicável' is not a valid Country
Skipping line 3 in trading admissions batch 66027: failed to create object from value 'EUA' at field 'Pais': 'EUA' is not a valid Country

Additional context

This is another bug like callmegiorgio/pycvm#9, which does not come from this library but rather from CVM's side. It seems the ENET software does not enforce valid country names, which results in values such as "Espanhã", "Brasi", or even "..............." (LOL).