OpenDataHK / hansard

Archive of Hong Kong Legco Votes

Archive of Hong Kong Legco Votes #1

Open siuying opened 11 years ago

siuying commented 11 years ago

any idea on how to do this?

alanho commented 11 years ago

i'm afraid i overcomplicated this. the first question is how to convert the old records into machine-readable ones; one option is OCR/image processing. for future votes, i'm in touch with some contacts who have access to the LegCo Secretariat, to see if they can improve the system to store a machine-readable format instead of scanned PDFs

sammyfung commented 11 years ago

It is difficult to use OCR to turn those scanned records into machine-readable ones. For LegCo meeting minutes, I checked a few days ago and found that all the scanned records (minutes only) from 1985 have been replaced; most of them are now 'machine readable' PDFs.

And Charles Mok said on Saturday at the HKLUG/ITFest seminar that the LegCo Secretariat is preparing machine-readable copies of voting results, but we don't know when they will be completed.

siuying commented 11 years ago

good news! :+1:

alanho commented 11 years ago

any links to those machine readable PDFs?

it doesn't need to be full or perfect OCR; maybe a simple histogram will do the job? each vote is either yes, no or abstain.

but if somebody can confirm that LegCo will convert the old data to a machine-readable format, we could just wait a bit. if not, maybe someone could get in touch with a computer vision lecturer/professor at a local university and set this task as an assignment for students ;D
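
To make the histogram idea above concrete, here is a minimal sketch, not a tested solution for the real scans. It assumes each member's row has already been cropped to a grayscale image (a list of pixel rows, 0 = black, 255 = white) and that the row is split into three equal-width cells in the order yes / no / abstain. The layout, the threshold and the function names are all assumptions for illustration.

```python
from typing import List, Optional, Sequence

# Assumed column order of the vote cells in a cropped row (hypothetical).
CHOICES = ("yes", "no", "abstain")


def dark_pixel_counts(row_image: Sequence[Sequence[int]],
                      n_columns: int = 3,
                      threshold: int = 128) -> List[int]:
    """Count dark pixels in each of n_columns equal-width vertical strips."""
    counts = [0] * n_columns
    if not row_image or not row_image[0]:
        return counts
    width = len(row_image[0])
    for line in row_image:
        for x, value in enumerate(line):
            if value < threshold:
                # Map the pixel's x position to the strip it falls in.
                strip = min(x * n_columns // width, n_columns - 1)
                counts[strip] += 1
    return counts


def classify_vote(row_image: Sequence[Sequence[int]],
                  min_ink: int = 1) -> Optional[str]:
    """Return the choice whose cell holds the most ink, or None if blank."""
    counts = dark_pixel_counts(row_image, len(CHOICES))
    best = max(range(len(counts)), key=counts.__getitem__)
    if counts[best] < min_ink:
        return None
    return CHOICES[best]
```

A row with a mark only in the middle strip would come back as "no", and a blank row as None (for example a member who was absent). Real scans would need deskewing and a tuned threshold first.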

siuying commented 11 years ago

@alanho http://www.legco.gov.hk/yr12-13/chinese/counmtg/motion/mot_1213.htm#toptbl it will need some smart algorithm to extract them into structured data, though.
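
As a rough first step towards that, the PDF links on an index page like the one above could be collected before any parsing happens. This is only a sketch: it assumes the page exposes the records as plain `<a href="....pdf">` links, which has not been checked against the real markup, and the names `PdfLinkCollector` and `collect_pdf_links` are made up for illustration. It covers locating the documents only, not turning them into structured data.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urljoin


class PdfLinkCollector(HTMLParser):
    """Collect absolute URLs of all <a href> targets ending in .pdf."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and href.lower().endswith(".pdf"):
            # Resolve relative links against the page URL.
            self.links.append(urljoin(self.base_url, href))


def collect_pdf_links(html: str, base_url: str) -> List[str]:
    """Return every PDF link found in the given HTML text."""
    parser = PdfLinkCollector(base_url)
    parser.feed(html)
    return parser.links
```

The HTML text itself would have to be downloaded separately and decoded with whatever encoding the page declares.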

alanho commented 11 years ago

so far.. this is all i have time for.. a script to extract all the voting results into images or PDFs..

https://gist.github.com/alanho/5433521

siuying commented 11 years ago

not very useful if we only get images and still need manual intervention. Perhaps just write a scraper to extract data from this app, where the data is entered manually: https://itunes.apple.com/hk/app/yi-yuan-biao-xian-lu/id549783193?mt=8

alanho commented 11 years ago

interesting app, didn't know about it! but i guess they don't have all the voting results, do they??

alanho commented 11 years ago

the target is to build a database that these guys can use, so they don't have to worry about data input and can just focus on presenting the data in their own way
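
One possible shape for such a database, as a sketch only: two SQLite tables, one for motions and one for individual votes. The table names, column names and the three allowed vote values are assumptions based on the yes/no/abstain discussion above, not an agreed schema.

```python
import sqlite3
from typing import Dict

# Hypothetical schema: one row per motion, one row per member per motion.
SCHEMA = """
CREATE TABLE motions (
    id           INTEGER PRIMARY KEY,
    meeting_date TEXT NOT NULL,
    title        TEXT NOT NULL
);
CREATE TABLE votes (
    motion_id INTEGER NOT NULL REFERENCES motions(id),
    member    TEXT NOT NULL,
    vote      TEXT NOT NULL CHECK (vote IN ('yes', 'no', 'abstain')),
    PRIMARY KEY (motion_id, member)
);
"""


def create_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database at path and create the empty tables."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def tally(conn: sqlite3.Connection, motion_id: int) -> Dict[str, int]:
    """Return the number of votes per choice for one motion."""
    rows = conn.execute(
        "SELECT vote, COUNT(*) FROM votes WHERE motion_id = ? GROUP BY vote",
        (motion_id,),
    )
    return dict(rows.fetchall())
```

Apps would then only query tables like these (for example via `tally`) instead of entering the data themselves. The CHECK constraint rejects any vote value outside the three assumed choices.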

siuying commented 11 years ago

I think they have all the voting results in the period, but only for 2008-2012