CorrelAid / pystatis

MIT License

User feedback #49

Open chesselingfm opened 3 months ago

chesselingfm commented 3 months ago

First of all: Thank you for all the work you have put into this! The API wrapper is very helpful.

I did some test driving and came across some issues. Maybe it's a personal problem or I haven't read the doc properly - forgive me for that ;)

1.) I had to pip install it using --no-dependencies, because my current pandas version is much newer (2.2.0) than the dependency pandas<2.0.0,>=1.4.3

2.) trying

tabelle = Table(name='81000-0001') 
tabelle.get_data(str="all") 

Problem: I only get results from 2014 onward, like the web export from Genesis online. But the table has data going back to 1991. How can I retrieve it?

3.) The wrapper downloads data in the German format (commas as decimal separators etc.), so pandas doesn't recognize the columns as numbers and we have to convert them manually. Would it be possible to pass arguments similar to pd.read_csv(decimal=",", thousands=".")? And could missing values be NaN instead of "-"? If I remember correctly, Michael said something about an English API endpoint that delivers data in a format that works better for pandas.

So, those are my observations, but again: sorry if I was too stupid to find the correct way in the docs and notebooks.

bergnerjonas commented 1 month ago

Hi @chesselingfm, thanks for your feedback! Regarding your questions: 1) The current dependency constraint (pandas = "^2.0") actually dictates a pandas version >=2.0. Could you try installing the package again, making sure that you use a fresh virtual environment? 2) You can bypass this by passing the startyear argument directly to the get_data method, such as:

table = Table(name='81000-0001') 
table.get_data(startyear="2000") 

3) In theory you can pass the parameter "language" to the get_data method, but this currently returns an error (https://github.com/CorrelAid/pystatis/issues/88). Missing data, on the other hand, is correctly displayed as NaN in the current version of pystatis (0.2.0), regardless of the language parameter. Try running pystatis.__version__ to check which version you are using.
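
Until the language parameter works, the German-format columns from point 3 can be converted by hand in pandas. The following is only a minimal sketch of that workaround, not part of pystatis: the helper name german_to_numeric, the column name "wert" and the sample values are all made up for illustration, and it assumes the affected columns arrive as plain strings.

```python
import pandas as pd


def german_to_numeric(series: pd.Series) -> pd.Series:
    """Convert strings like '1.585.800' or '12,5' to floats (hypothetical helper).

    Assumes '.' is only used as a thousands separator and ',' as the
    decimal separator. Anything that is not a number afterwards
    (for example the '-' placeholder) becomes NaN.
    """
    cleaned = (
        series.astype(str)
        .str.strip()
        .str.replace(".", "", regex=False)   # drop thousands separators
        .str.replace(",", ".", regex=False)  # decimal comma -> decimal point
    )
    return pd.to_numeric(cleaned, errors="coerce")


# Made-up sample data standing in for a downloaded table column.
df = pd.DataFrame({"wert": ["1.585.800", "-", "12,5"]})
converted = german_to_numeric(df["wert"])
print(converted)
```

If the raw CSV text is available instead of a DataFrame, pd.read_csv with decimal=",", thousands="." and na_values=["-"] should do the same conversion while parsing.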