hupili / python-for-data-and-media-communication-gitbook

An open source book on Python tailored for communication students with zero background

Pyecharts cannot display figures (pyecharts xAxis.type=category error) #104

Open CathyChang1996 opened 5 years ago

CathyChang1996 commented 5 years ago

Troubleshooting

Describe your environment

Describe your question

The line chart cannot display the figures.

The minimum code (snippet) to reproduce the issue

import pandas as pd
from dateutil.parser import parse
df = pd.read_csv('Billboard Top1 1958-2018.csv')
df['datetime']=df['date'].apply(parse)
describe_song=df.set_index('datetime').resample('1y')['song'].describe()
describe_song.head()

hhh=pd.DataFrame(pd.to_numeric(describe_song['unique']))
years = list(hhh.reset_index()['datetime'].apply(lambda x: x.year).values)
years
values = list(hhh.unique.values)
values
from pyecharts import Line

line = Line("Unique change trend")
line.add("Unique Songs", years, values, mark_line=["average"], mark_point=['max','min'])
line

What is the closest answer you can find?

Working code I used before:

import pandas as pd
from pyecharts import Bar

af = pd.read_csv('The Year End Chart.csv')
# Ten most frequent artists, largest count first
YearEnd_counts = af['Artist'].value_counts()[:10].sort_values(ascending=False)
YearEndartists = pd.DataFrame(YearEnd_counts)

attr1 = YearEndartists.index
v11 = YearEndartists.Artist
bar2 = Bar("Top 10 Year End Artists")
bar2.add("Year End Artists", attr1, v11, mark_line=["average"], is_label_show=True)
bar2

The csv documents are in my assignment2 folder in 'python-data-assignments' repo

hupili commented 5 years ago

@CathyChang1996 , can you give the direct link of notebooks and data files?

CathyChang1996 commented 5 years ago

notebook link:

https://github.com/CathyChang1996/python-data-assignments/blob/master/assignment2/sample.ipynb

csv documents:

https://github.com/CathyChang1996/python-data-assignments/blob/master/assignment2/Billboard%20Top1%201958-2018.csv

https://github.com/CathyChang1996/python-data-assignments/blob/master/assignment2/The%20Year%20End%20Chart.csv

hupili commented 5 years ago

This is a bug in pyecharts. In short, pyecharts assumes that the xAxis holds categorical data. The troubleshooting steps follow; workarounds will come in follow-up posts.

We need to come up with a minimal reproducible example:

from pyecharts import Line

line = Line("Unique change trend")
line.add("Unique Songs", [1990, 2000, 2010], [10, 30, 20]) #, mark_line=["average"], mark_point=['max','min'])
line.render('test.html')

The one in the OP is not minimal: it combines loading real data with pyecharts. Substituting some artificial data still reproduces the error. The above example writes the chart to test.html.

We check it out:

screenshot 2018-11-23 at 11 50 34 pm

It shows that the data is correctly loaded in the browser. So the problem must lie in how the other ECharts options are used. One can test this on the ECharts online testbed.

screenshot 2018-11-23 at 11 50 41 pm

The chart is blank. If the official ECharts online testbed cannot show the chart, pyecharts cannot either. That is because pyecharts is just a Python binding, like a translator; ECharts does the real work in the background.

Since the original line chart in the gallery works, we can use it for further testing. By changing the series.data from 820 to [5, 820], we find that 5 does not correspond to the 5 in the xAxis.

screenshot 2018-11-23 at 11 56 37 pm

A further test confirms the observation:

screenshot 2018-11-23 at 11 56 43 pm

The 5 and 3 above do not refer to values in the data space. They refer to indexes into the xAxis categories. This is because pyecharts assumes categorical data here.

screenshot 2018-11-23 at 11 58 17 pm

By changing the category to value, we solve the problem.

screenshot 2018-11-23 at 11 59 59 pm

This does not look good because the xAxis range is not properly set. Let's set min and max to solve this issue.

screenshot 2018-11-24 at 12 05 16 am

The above is the solution if you know JavaScript.
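
For readers more comfortable in Python, the fix can be sketched as a plain dictionary that mirrors the shape of an ECharts option object. This is an illustrative reconstruction, not the exact option shown in the screenshots: the sample years and values come from the minimal example earlier in this thread, the helper name `build_value_axis_option` is hypothetical, and it assumes that with a value axis each data point is given as an [x, y] pair.

```python
import json


def build_value_axis_option(years, values):
    """Build an ECharts-style option dict with a numeric (value) x axis.

    Hypothetical helper for illustration; not part of pyecharts.
    """
    return {
        # 'value' makes ECharts read x as a number, not as a category index
        "xAxis": {"type": "value", "min": min(years), "max": max(years)},
        "yAxis": {"type": "value"},
        "series": [
            {"type": "line", "data": [[x, y] for x, y in zip(years, values)]}
        ],
    }


option = build_value_axis_option([1990, 2000, 2010], [10, 30, 20])
# The JSON text is what one would paste into the ECharts online testbed
print(json.dumps(option, indent=2))
```

Setting min and max explicitly keeps the axis from starting at 0, which would squeeze the years into a narrow strip on the right.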

hupili commented 5 years ago

There is no option from pyecharts to change the type from "category" to "value".

From the official doc

The function signature (parameter list)

screenshot 2018-11-24 at 12 17 03 am

All the samples use categorical data.

screenshot 2018-11-24 at 12 17 07 am

Actually, the above is not a typical (correct) use of "line". The above is a typical case for bar chart.

Now we have two options:

  1. Use categorical data in xAxis -- Only works if the original data is uniform
  2. Modify the code of pyecharts

hupili commented 5 years ago

Option 1 seems OK in this case because your original years list is uniform. Note that the graph will be distorted if the years are not uniformly spaced (consecutive).

Solution:

from pyecharts import Line

line = Line("Unique change trend")
line.add("Unique Songs", [str(y) for y in years], values) #, mark_line=["average"], mark_point=['max','min'])
line

Result:

screenshot 2018-11-24 at 12 21 55 am
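
Before relying on Option 1, it may help to check that the x values really are evenly spaced. The helper below is a hypothetical sketch in plain Python (the name `is_evenly_spaced` is not part of pyecharts), and the second list of years is made up for illustration.

```python
def is_evenly_spaced(xs):
    """Return True if consecutive values in xs all differ by the same step."""
    xs = list(xs)
    if len(xs) < 3:
        # Zero, one or two points are trivially evenly spaced
        return True
    step = xs[1] - xs[0]
    return all(b - a == step for a, b in zip(xs, xs[1:]))


print(is_evenly_spaced([1990, 2000, 2010]))  # True
print(is_evenly_spaced([1990, 2000, 2001]))  # False
```

If the check returns False, a categorical axis would place the points at equal intervals regardless of the actual gaps between them.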

hupili commented 5 years ago

Here is Option 2 -- hack pyecharts.

from pyecharts import Line

line = Line("Unique change trend")
line.add("Unique Songs", years, values) #, mark_line=["average"], mark_point=['max','min'])
xAxis = line._option['xAxis'][0]
xAxis['type'] = 'value'
xAxis['min'] = min(years)
xAxis['max'] = max(years)
line

screenshot 2018-11-24 at 12 31 04 am

NOTE: this looks the same as above. That is just a coincidence. When your year column is not consecutive, this version is correct, while the one above will give a wrong visualisation.
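
The difference can be seen without drawing anything. The sketch below compares where each point would land horizontally under the two axis types, using made-up, non-consecutive years. The function names are hypothetical, and positions are normalised to the range 0 to 1 as a simplification; this does not reproduce the exact ECharts layout.

```python
def category_positions(xs):
    """Place points by their index only (assumes at least two points)."""
    n = len(xs)
    return [i / (n - 1) for i in range(n)]


def value_positions(xs):
    """Place points by their numeric value (assumes min and max differ)."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]


years = [1990, 2000, 2001]  # made-up, unevenly spaced
print(category_positions(years))  # [0.0, 0.5, 1.0]
print(value_positions(years))     # the middle point sits close to the right end
```

With a categorical axis, 2000 lands in the middle even though it is only one year before 2001; with a value axis it lands where the numbers say it should.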

hupili commented 5 years ago

Here are all the data, code and troubleshooting notes: https://github.com/hupili/python-for-data-and-media-communication/tree/master/pyecharts-examples/debug-echarts-time-series

[DONE]

CathyChang1996 commented 5 years ago

OMG !!! It helps me a lot!!! Really got a great lesson about Pyecharts and how to solve a problem like this! 🙏 ⛽️ 🌟

CathyChang1996 commented 5 years ago

Pyecharts has been upgraded to version 1.0.0, which adds new functions for setting various options, such as legend and axis. To solve this problem, we can use the following code:


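import pandas as pd
from dateutil.parser import parse
from pyecharts import options as opts
from pyecharts.charts import Line

df = pd.read_csv('Billboard Top1 1958-2018.csv')
df['datetime'] = df['date'].apply(parse)
# Yearly summary of the 'song' column; 'unique' is the number of distinct songs
describe_song = df.set_index('datetime').resample('1y')['song'].describe()
hhh = pd.DataFrame(describe_song['unique'])

line = (
    Line()
    # Pass plain lists; pyecharts may not serialize pandas objects directly
    .add_xaxis(hhh.index.tolist())
    .add_yaxis("Unique songs", hhh['unique'].tolist())
    # A time axis spaces the points by date rather than by index
    .set_global_opts(xaxis_opts=opts.AxisOpts(type_="time"))
)
line.render_notebook()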