olemb / dbfread

Read DBF Files with Python
MIT License

Support for DBC (compressed DBF) files #37

Open augusto-herrmann opened 4 years ago

augusto-herrmann commented 4 years ago

Would it make sense for dbfread to support compressed DBF files – DBC files?

It would be similar to what the read.dbc package does in R. I couldn't find a package that does the same in Python. I've tried dbfread and it currently only reads uncompressed DBF files.

If you need DBC files for testing, there are lots of them at the DATASUS website (official public healthcare system statistics from Brazil).

zaneselvans commented 4 years ago

Oh, that's what the DBC files are. I've got one which I've been grepping for strings to regenerate the names of all the tables and fields in a database from the US Federal Energy Regulatory Commission. It's a big pain! If it were possible to just read them directly, that would be much, much better.

zaneselvans commented 4 years ago

Oh, haha, no it's not what I've got, unfortunately...

This function allows you to decompress a DBC file into its DBF counterpart. Please note that this is the file format used by the Brazilian Ministry of Health (DATASUS), and it is not related to the FoxPro or CANdb DBC file formats.

augusto-herrmann commented 4 years ago

I have found this other Python implementation that is capable of reading DBC files from SUS, the Brazilian public health system.

Feel free to decide whether or not to implement reading of these DBC files in dbfread, using their implementation as a reference.

olemb commented 3 years ago

This would be nice to have.

We will first have to add support for reading from a file object (see issue #53) but once that is in place this is something we could definitely look into.
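To illustrate what reading from a file object involves, here is a minimal sketch that parses the fixed part of a DBF header from any binary stream rather than from a path. It is not dbfread's code: `DBFHeader` and `read_dbf_header` are hypothetical names, and only the first 12 header bytes are interpreted.

```python
import io
import struct
from typing import BinaryIO, NamedTuple


class DBFHeader(NamedTuple):
    """Hypothetical container for a few fixed DBF header fields."""
    version: int
    num_records: int
    header_length: int
    record_length: int


def read_dbf_header(stream: BinaryIO) -> DBFHeader:
    """Parse the 32-byte DBF file header from any binary file-like object."""
    raw = stream.read(32)
    if len(raw) < 32:
        raise ValueError("stream too short to contain a DBF header")
    # Little-endian layout of the first 12 bytes:
    # version (1 byte), last-update date (3 bytes), record count (uint32),
    # header length (uint16), record length (uint16).
    version, _yy, _mm, _dd, num_records, header_len, record_len = struct.unpack(
        "<BBBBLHH", raw[:12]
    )
    return DBFHeader(version, num_records, header_len, record_len)


# Build a fake in-memory header: 2 records, 65-byte header, 11-byte records.
fake_header = struct.pack("<BBBBLHH", 0x03, 121, 5, 17, 2, 65, 11) + bytes(20)
header = read_dbf_header(io.BytesIO(fake_header))
print(header)
```

Because the function only calls `read()`, the same code would work on an opened file, an `io.BytesIO` buffer, or the output of a decompressor, which is why file-object support is a natural prerequisite for compressed input.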

Ochuat commented 8 months ago

It would be very valuable and useful to be able to decompress DBC files directly through this library, using Python.

I have already tested several repositories, libraries, and code snippets for handling the file formats used by DATASUS in Brazil, but without success: each one led to an error in some component that was outdated or no longer existed.

One option is the simpler route of just unpacking DBC to DBF; another is converting to CSV or a pandas DataFrame (which I believe is more viable internally, without converting to DBF first).
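For the CSV route, once records are available as dictionaries, writing them out is short. The sketch below uses only the standard library; `records_to_csv` is a hypothetical helper, and the sample rows stand in for records read from a real table.

```python
import csv
import io
from typing import Iterable, Mapping, Sequence, TextIO


def records_to_csv(
    records: Iterable[Mapping[str, object]],
    field_names: Sequence[str],
    out: TextIO,
) -> int:
    """Write dict-like records to `out` as CSV and return the row count."""
    writer = csv.DictWriter(out, fieldnames=list(field_names))
    writer.writeheader()
    count = 0
    for record in records:
        writer.writerow(record)
        count += 1
    return count


# Stand-in rows; with dbfread these would come from iterating a DBF table.
sample = [{"NAME": "Alice", "AGE": 30}, {"NAME": "Bob", "AGE": 41}]
buffer = io.StringIO()
written = records_to_csv(sample, ["NAME", "AGE"], buffer)
print(buffer.getvalue())
```

With dbfread on an already decompressed file, the call would presumably look like `records_to_csv(table, table.field_names, f)`, where `table = DBF("file.dbf")` and `f` is a text file opened with `newline=""`. The DBC decompression step itself is not covered here.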

But even without that, your creation is very valuable and helps me in my work. Congratulations and thank you.