Open vinodkhare opened 12 years ago
Yes, pie charts would be nice. They are somewhat trickier than other plot types though, since the slices make little sense without the corresponding labels. For other plots, the "labels" are simply another number (e.g. the x-axis value in a 2D plot).
I can think of three ways of doing this: 1) Just pick off the slices, label them 1, 2, 3, etc., and let the user decide what is what. 2) Ask the user for the labels. This can get pretty tedious if you have a large chart. 3) Do some text extraction, but a reliable extractor is difficult to build. However, it could be useful in many other places as well.
The point is that for simple pie charts, it's quite easy to read off the values anyway. And for large, complicated ones, I haven't figured out a way to automate the process. Got any suggestions?
I made this feature request because I came across a pie chart that had the labels but not the values. Hence the need for digitization.
I'd prefer option 1 to begin with. Maybe later we could give the user an option to edit the labels.
I think option #1 would work for the vast majority of users. It's pretty easy to label the slices manually in Excel afterwards, since they naturally come out in strict order.
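To make option 1 concrete, here is a rough Python sketch of how numbered slices could be turned into percentages. It assumes the user clicks the pie center and one point on each boundary line between slices; the function name `slice_percentages` and this input format are made up for illustration and are not part of any existing tool.

```python
import math


def slice_percentages(center, boundary_points):
    """Return {slice_number: percent} from user-picked slice boundaries.

    center: (x, y) of the pie center (hypothetical input format).
    boundary_points: one (x, y) point on each boundary line between
        slices, in any order. N boundaries define N slices.
    """
    cx, cy = center
    # Angle of each boundary around the center, normalized to [0, 2*pi).
    angles = sorted(
        math.atan2(y - cy, x - cx) % (2 * math.pi)
        for x, y in boundary_points
    )
    n = len(angles)
    if n < 2:
        raise ValueError("need at least two slice boundaries")

    result = {}
    for i in range(n):
        # Sweep from this boundary to the next one, wrapping at the end.
        sweep = (angles[(i + 1) % n] - angles[i]) % (2 * math.pi)
        # Slices are simply numbered 1, 2, 3, ... in angular order.
        result[i + 1] = 100.0 * sweep / (2 * math.pi)
    return result


if __name__ == "__main__":
    # Boundaries at 0, 90 and 180 degrees give two quarters and one half.
    for number, percent in slice_percentages((0, 0), [(1, 0), (0, 1), (-1, 0)]).items():
        print(f"slice {number}: {percent:.1f}%")
```

The slices come out numbered in order of increasing angle, so the user only has to match 1, 2, 3, ... against the legend. In image coordinates (y pointing down) that order runs clockwise on screen, but the percentages are the same either way.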
@ankitrohatgi
Hey guys, I have created something that might be of interest to you...
https://git.ehtec.co/research/pie-chart-ocr
This Python library can extract data from pie charts. It does not calculate percentages from sector angles, as some other (inaccurate) tools do. I don't see the need for that, as every decent pie chart has the percentage numbers next to the sectors. I rarely come across one that doesn't.
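For anyone wondering what reading the printed percentages might look like once the text has been recognized, here is a minimal sketch. It is not the pie-chart-ocr implementation; the function `parse_percentages` and the assumption that each recognized text line holds one "label value%" pair are purely illustrative.

```python
import re

# Matches fragments such as "Apples 42%" or "Pears: 33,5 %".
# The label is whatever non-digit text precedes the number on the line.
PERCENT_RE = re.compile(
    r"(?P<label>[^\d%\n]*?)\s*(?P<value>\d+(?:[.,]\d+)?)\s*%"
)


def parse_percentages(ocr_text):
    """Return a list of (label, percent) pairs found in recognized text.

    ocr_text: plain text as an OCR step might return it, one chart
        label per line (an assumption made for this sketch).
    """
    pairs = []
    for line in ocr_text.splitlines():
        match = PERCENT_RE.search(line)
        if match is None:
            continue  # No percentage on this line, e.g. a title or noise.
        label = match.group("label").strip(" :-")
        # Accept both "33.5" and the decimal comma form "33,5".
        value = float(match.group("value").replace(",", "."))
        pairs.append((label, value))
    return pairs


if __name__ == "__main__":
    text = "Apples 42%\nPears: 33,5 %\nnoise\nOther 24.5%"
    print(parse_percentages(text))
```

Labels that contain digits themselves (say "Q1 20%") would come back with an empty label here, so a real tool would need something smarter for matching text to sectors.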
The chart is supplied as an image, from which text is extracted via OCR and MSER. Already supported is:
What is planned to be added in the near future:
This project is MIT licensed, so feel free to make use of it. I am very busy at the moment, so if anybody is interested in this feature and wants to help with its development, please let me know. The hardest work is already done.
My email address is elias.hohl@ehtec.co.
Automatically extract percentages from a pie chart.