pandas-dev / pandas

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
https://pandas.pydata.org
BSD 3-Clause "New" or "Revised" License

BUG: xlim and ylim not restricting plot area #40781

Open regmibijay opened 3 years ago

regmibijay commented 3 years ago

Code Sample, a copy-pastable example

import pandas as pd
import matplotlib.pyplot as plt

# myfile is the path to a CSV that contains 2 columns, Time in ms and Voltage in mV
df = pd.read_csv(myfile)
xlabel = "Time in ms"
ylabel = "Voltage in mV"
xlim = [0, 120]
ylim = [0, 8]
df.plot(x="Time", xlabel=xlabel, ylabel=ylabel, xlim=xlim, ylim=ylim, kind="line")
plt.show()

Problem description

I am plotting a large dataset, stored in CSV format, using DataFrame.plot() in pandas. As per the documentation, I pass xlim and ylim as arguments to df.plot. The axes take the xlim and ylim values accordingly, but the figure does not scale to these values: the plot shows some part of the graph, not the area defined by xlim and ylim.

Without xlim: [image]
With xlim set: [image]
Sample files: samples
Project: project

Expected Output

I want to plot only the specified area. In my example image it would be the spike.

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit           : f2c8480af2f25efdbd803218b9d87980f416563e
python           : 3.9.2.final.0
python-bits      : 64
OS               : Linux
OS-release       : 4.4.0-19041-Microsoft
Version          : #488-Microsoft Mon Sep 01 13:43:00 PST 2020
machine          : x86_64
processor        :
byteorder        : little
LC_ALL           : None
LANG             : en_US.UTF-8
LOCALE           : en_US.UTF-8
pandas           : 1.2.3
numpy            : 1.20.2
pytz             : 2021.1
dateutil         : 2.8.1
pip              : 21.0.1
setuptools       : 52.0.0
Cython           : None
pytest           : None
hypothesis       : None
sphinx           : None
blosc            : None
feather          : None
xlsxwriter       : None
lxml.etree       : None
html5lib         : None
pymysql          : None
psycopg2         : None
jinja2           : 2.11.3
IPython          : 7.22.0
pandas_datareader: None
bs4              : 4.9.3
bottleneck       : None
fsspec           : None
fastparquet      : None
gcsfs            : None
matplotlib       : 3.4.1
numexpr          : None
odfpy            : None
openpyxl         : 3.0.7
pandas_gbq       : None
pyarrow          : None
pyxlsb           : None
s3fs             : None
scipy            : 1.6.2
sqlalchemy       : None
tables           : None
tabulate         : None
xarray           : None
xlrd             : None
xlwt             : None
numba            : None

rhshadrach commented 3 years ago

I think it should be ylim=[0, 8e-6], although I'm confused by the y-axis being labeled with 1e-6 in your second plot.

rhshadrach commented 3 years ago

I'm not able to reproduce this on master with either ylim=[0, 8] or ylim=[0, 8e-6]. For the former, I get a line indistinguishable from y=0; for the latter, I get the expected output:

image

regmibijay commented 3 years ago

After the comment from @rhshadrach, I was able to debug what was happening. I was processing command-line parameters with argparse, and this behaviour seems to occur when the supplied xlim and ylim values are strings rather than int or float. Maybe we should either raise a ValueError or convert the passed data to int or float.
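
For readers hitting the same problem, here is a minimal sketch of the situation described above. The flag names `--xlim` and `--ylim` and the helper `parse_limits` are hypothetical and do not come from the reporter's script. argparse returns strings unless a `type` is given, so passing `type=float` is one way to make sure numeric limits reach `df.plot`:

```python
import argparse


def parse_limits(argv):
    """Parse hypothetical --xlim/--ylim flags into pairs of floats."""
    parser = argparse.ArgumentParser()
    # Without type=float, argparse would return strings such as ['0', '120'].
    parser.add_argument("--xlim", nargs=2, type=float, metavar=("MIN", "MAX"))
    parser.add_argument("--ylim", nargs=2, type=float, metavar=("MIN", "MAX"))
    return parser.parse_args(argv)


args = parse_limits(["--xlim", "0", "120", "--ylim", "0", "8e-6"])
# args.xlim and args.ylim are lists of floats and can be passed on as
# df.plot(..., xlim=args.xlim, ylim=args.ylim)
```

With the conversion done at the command-line boundary, the plotting call only ever sees numeric limits.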

rhshadrach commented 3 years ago

Thanks for reporting back @regmibijay - I agree and think either is a workable solution, although I personally would lean toward raising if the values are not numeric. Would you be interested in submitting a PR to implement the patch?
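
To illustrate the "raise if not numeric" option, the following is a rough sketch only, not the pandas patch. The helper name `validate_limits` is hypothetical, and the check is deliberately simplified: it accepts only plain real numbers, so it is stricter than matplotlib, which also accepts `None` for one bound.

```python
from numbers import Real


def validate_limits(name, limits):
    """Hypothetical check: require two real numbers, else raise ValueError."""
    if limits is None:
        return None
    values = tuple(limits)
    if len(values) != 2:
        raise ValueError(f"{name} must contain exactly two values, got {len(values)}")
    for value in values:
        # bool is a subclass of int, so it is excluded explicitly.
        if isinstance(value, bool) or not isinstance(value, Real):
            raise ValueError(
                f"{name} values must be numeric, got {value!r} "
                f"of type {type(value).__name__}"
            )
    return values
```

Called as `validate_limits("xlim", ["0", "120"])`, this raises a ValueError that names the offending value, instead of letting string limits reach the plotting backend.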

regmibijay commented 3 years ago

I am working on this issue and will address it with a PR soon!