Open cbaggers opened 8 years ago
Possible method: Mechanical Turk
We use Amazon's Mechanical Turk to get people to do this painful job. We can make a system to bring up the doc page and instruct the agents how to find the information we need.
Is the info on https://mesamatrix.net useful ?
Wow! Thanks, that's a kickass link but sadly I don't think it covers the details we need. It's stuff like this:
http://docs.gl/el3/equal where there are different ES versions for different signatures. Just munging those tables into lists would be a massive win. Hmm, I had a quick play with an html_table->json site and it turned that table into:
[
["Function Name","1.00","3.00","3.10" ],
["equal (vec)","✔","✔","✔" ],
["equal (bvec)","✔","✔","✔" ],
["equal (ivec)","✔","✔","✔" ],
["equal (uvec)","-","✔","✔" ]
]
Which is totally workable. So we would need to:
After that we need to expand those 'generic' types, which is annoying but tractable, and also fix all the edge cases by hand (like when they don't have the proper signature with the versions... ugh).
I won't have time to do this for a while, but at least there is a path to something useful.
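For what it's worth, the "munge those tables into lists" step could look something like the sketch below. It only assumes the row layout shown in the JSON above (header row first, a tick for "available"); the function name `versions_by_entry` is made up for illustration and other pages may well need extra handling.

```python
# Rows copied from the html_table->json output quoted above.
ROWS = [
    ["Function Name", "1.00", "3.00", "3.10"],
    ["equal (vec)", "✔", "✔", "✔"],
    ["equal (bvec)", "✔", "✔", "✔"],
    ["equal (ivec)", "✔", "✔", "✔"],
    ["equal (uvec)", "-", "✔", "✔"],
]


def versions_by_entry(rows):
    """Map each table entry to the list of versions marked with a tick."""
    header, *body = rows
    versions = header[1:]  # drop the "Function Name" column label
    return {
        name: [v for v, mark in zip(versions, marks) if mark == "✔"]
        for name, *marks in body
    }


if __name__ == "__main__":
    for name, versions in versions_by_entry(ROWS).items():
        print(name, versions)
```

Anything that isn't a tick is treated as "not available", which matches the "-" in the table above but is otherwise a guess.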
p.s. for an example of stuff that sucks, `distance` has this signature:
float distance(genType p0, genType p1)
but in the version info it's distance (genType).
This case is a simple one, as the type string matches; they get plenty worse.
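For the simple cases like this one, matching a full signature to its version-info entry might be sketched as below. The key format (name, a space, then the distinct argument types in parentheses) is inferred only from the `distance` and `equal` examples in this thread, and `version_key` is a hypothetical helper, so treat it as a starting point for the easy cases rather than something that covers the nasty ones.

```python
import re

# Assumed shape of a prototype: "<return type> <name>(<type> <arg>, ...)".
PROTOTYPE = re.compile(r"^\s*(\w+)\s+(\w+)\s*\((.*)\)\s*;?\s*$")


def version_key(signature):
    """Build a guess at the version-table entry for a full signature."""
    match = PROTOTYPE.match(signature)
    if match is None:
        raise ValueError(f"not a recognisable prototype: {signature!r}")
    _return_type, name, params = match.groups()
    arg_types = []
    for param in params.split(","):
        words = param.split()
        if not words or words == ["void"]:
            continue
        # The argument name is the last word and the type is the word
        # before it, so leading qualifiers such as "out" are ignored.
        arg_type = words[-2] if len(words) >= 2 else words[0]
        if arg_type not in arg_types:  # keep each type once, in order
            arg_types.append(arg_type)
    return f"{name} ({', '.join(arg_types)})"


if __name__ == "__main__":
    print(version_key("float distance(genType p0, genType p1)"))
```

Entries whose type strings don't match the signature would fall out as lookup misses, which at least gives a list of edge cases to fix by hand.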
To be fair, ^^^ was pretty much my approach for the work I've already done and it was a rage-inducing experience. Ah well, I or someone else will bite the bullet at some point :p
Yeah I've done plenty of sanitising of user-generated data before, you end up with a semi-manual process usually.
OTOH, a large chunk of the errors will probably be classifiable into certain types.
There is also the piglit opengl test suite - not sure if it could be tangentially relevant to this.
Just a tiny update here, as I found a nicer method of dealing with the gl docs' html... don't.
for file in *xhtml; do w3m -dump "$file" > "$file.out.txt"; done
Renders the html as text and dumps it to disk. Visually the layout of the pages is fairly consistent and this lets us grep more easily (which frankly would have been a better approach for getting all the signatures).
It still leaves expanding the gtypes, but this is fairly scriptable (only fairly, as the docs can get quite creative in their uses of them).
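A rough sketch of the gtype expansion, assuming the usual GLSL convention that `genType` and friends stand for the scalar plus the 2, 3 and 4 component vectors, and that every placeholder in one signature takes the same component count. The `expand` helper is hypothetical, and the docs' more creative spellings (like the bare `vec` / `bvec` in the tables) are not handled here.

```python
import re

# Generic placeholders and the concrete GLSL types they stand for,
# listed in order of component count (1, 2, 3, 4).
GENERIC_TYPES = {
    "genType": ["float", "vec2", "vec3", "vec4"],
    "genIType": ["int", "ivec2", "ivec3", "ivec4"],
    "genUType": ["uint", "uvec2", "uvec3", "uvec4"],
    "genBType": ["bool", "bvec2", "bvec3", "bvec4"],
}


def expand(signature):
    """Return the concrete signatures a generic signature stands for."""
    present = [g for g in GENERIC_TYPES if re.search(rf"\b{g}\b", signature)]
    if not present:
        return [signature]  # nothing generic, keep as-is
    expanded = []
    for i in range(4):
        concrete = signature
        # Replace every placeholder with the type of the same size.
        for g in present:
            concrete = re.sub(rf"\b{g}\b", GENERIC_TYPES[g][i], concrete)
        expanded.append(concrete)
    return expanded


if __name__ == "__main__":
    for line in expand("float distance(genType p0, genType p1)"):
        print(line)
```

Substituting in lockstep keeps mixed signatures such as `genIType floatBitsToInt(genType value)` from producing mismatched sizes.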
Update all definitions to also include whether each definition is available in ES & Core.
This is hard as what is/isn't in core doesn't exist in one place; it's mainly dotted through man pages or the pdf specs.
We need to work out a methodology and get this going