Open GoogleCodeExporter opened 8 years ago
What do you mean by "supporting"? Do you mean being able to point Gridworks at a
Wikipedia page containing a table and create a grid from it?
Original comment by stefano.mazzocchi@gmail.com
on 21 May 2010 at 5:32
This is probably best supported by an independent bookmarklet. Gridworks itself
should support creating a project by pointing to a data file URL and by pasting
in raw text.
Original comment by dfhu...@gmail.com
on 21 May 2010 at 5:41
Original comment by stefano.mazzocchi@gmail.com
on 21 May 2010 at 5:43
An HTML table importer would also be useful. Adapting the XML importer to deal
with <th>, <tr> and <td> would be a start.
Original comment by iainsproat
on 21 May 2010 at 5:46
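To make the HTML table importer idea from the comment just above concrete, here is a minimal sketch in Python using only the standard library. It is not the Gridworks XML importer; the names TableParser and parse_table are made up for illustration, and nested tables, rowspan and colspan are not handled.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collects the <th>/<td> cells of every <tr> as a list of strings."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def _close_cell(self):
        if self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
        self._cell = None

    def _close_row(self):
        self._close_cell()
        if self._row is not None:
            self.rows.append(self._row)
        self._row = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._close_row()  # tolerate a missing </tr>
            self._row = []
        elif tag in ("th", "td") and self._row is not None:
            self._close_cell()  # tolerate a missing </td> or </th>
            self._cell = []

    def handle_endtag(self, tag):
        if tag in ("th", "td"):
            self._close_cell()
        elif tag == "tr":
            self._close_row()

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)


def parse_table(html):
    """Return the table rows found in html as lists of cell strings."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    parser._close_row()  # flush a row left open at end of input
    return parser.rows
```

The resulting list of rows could then be handed to whatever code already builds a grid from CSV-like data.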
I'm open concerning implementation as long as it preserves the Wikipedia link
and uses it to resolve to an exact Freebase topic without human intervention.
Possibilities that come to mind include:
- allowing an HTML table cut from a web page to be pasted into Gridworks
- recognizing Excel / Open Office spreadsheets which contain Wikipedia links
Bonus points for the fewest manual steps needed to produce useful results.
Original comment by tfmorris
on 21 May 2010 at 5:50
One possibility is to convert the table content to CSV. A bookmarklet for that:
http://table2csv.zeusi.user.dev.freebaseapps.com/index
One problem with Wikipedia links is that blue and red links are mixed. If we
convert only the good (blue) ones, ids and names will end up in the same column,
but we should be able to reconcile them separately with facets.
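As an alternative to mixing ids and names in one column, the two kinds of link could be kept apart at conversion time. The sketch below is not what the bookmarklet does; it assumes the usual Wikipedia markup, where a red link carries class "new" and a redlink=1 query parameter, and the names LinkCollector and split_links are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qs, unquote, urlparse


class LinkCollector(HTMLParser):
    """Collects (text, href, is_red) for every <a> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._current = None  # [text fragments, href, is_red] or None

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        href = attrs.get("href") or ""
        classes = (attrs.get("class") or "").split()
        query = parse_qs(urlparse(href).query)
        # Assumption: red links have class "new" and/or redlink=1.
        is_red = "new" in classes or query.get("redlink") == ["1"]
        self._current = [[], href, is_red]

    def handle_data(self, data):
        if self._current is not None:
            self._current[0].append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            text, href, is_red = self._current
            self.links.append(("".join(text), href, is_red))
            self._current = None


def split_links(html):
    """Return (name, article_title) pairs; the title is None for red links."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    pairs = []
    for text, href, is_red in collector.links:
        path = urlparse(href).path
        title = None
        if not is_red and path.startswith("/wiki/"):
            title = unquote(path[len("/wiki/"):])
        pairs.append((text, title))
    return pairs
```

Writing the name and the title to separate CSV columns would leave the red-link rows with an empty title, which is easy to isolate with a facet.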
Original comment by antonio....@gmail.com
on 25 May 2010 at 10:12
Original comment by iainsproat
on 14 Oct 2010 at 5:02
I figured out where HTML links are stored in OO Calc, so I may be able to
easily add the ability to optionally convert linked cells into a value + link
pair (or even convert the Wikipedia link to a properly escaped Freebase key).
Hmmm, thinking out loud, a general HTML link -> Freebase key function in Refine
which was functionally the same as the old web client link parser could be very
useful. I think all the URI templates are still available even though they
aren't being used (on input) any more.
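A rough sketch of what such a link -> key function might look like for Wikipedia URLs follows. It is not the old web client link parser. It assumes the Freebase key escaping scheme in which ASCII letters and digits pass through, "_" and "-" are allowed only in the interior of the key, and every other character becomes "$" plus four uppercase hex digits. The function names are hypothetical, and neither the key namespace nor characters outside the Basic Multilingual Plane are dealt with.

```python
import string
from urllib.parse import unquote, urlparse

ALWAYS_OK = set(string.ascii_letters + string.digits)
INTERIOR_OK = ALWAYS_OK | {"_", "-"}


def wikipedia_title(url):
    """Return the article title from a /wiki/ URL, or None if it is not one."""
    path = urlparse(url).path
    if not path.startswith("/wiki/"):
        return None
    return unquote(path[len("/wiki/"):])


def quote_key(title):
    """Escape a title using the assumed $XXXX scheme described above."""
    out = []
    last = len(title) - 1
    for i, ch in enumerate(title):
        # "_" and "-" are accepted only when not first or last.
        allowed = ALWAYS_OK if i in (0, last) else INTERIOR_OK
        out.append(ch if ch in allowed else "$%04X" % ord(ch))
    return "".join(out)


def wikipedia_link_to_key(url):
    """Return the escaped key for a Wikipedia URL, or None for other URLs."""
    title = wikipedia_title(url)
    return None if title is None else quote_key(title)
```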
Another peculiarity of Wikipedia tables that I just discovered the other day is
the use of <span style="display:none"> elements as sort helpers. I don't know
about Excel, but OO Calc can't handle this at all.
A typical usage (from memory) might be something like
<span style="display:none">0000123456000000</span>1,234.56
where an invisible, zero-padded, fixed-point, numeric-only string is created so
that an alphabetic sort collates the values in the same order as a numeric sort
would. Unfortunately the few tools that I tried ignored the styling and munged
the two strings together, making them pretty much useless without manual cleanup.
Not sure what can be done about that one.
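One thing that could be tried is dropping the hidden spans before the cell text is extracted. The sketch below is only an illustration: it looks at inline style attributes on <span> elements, so sort keys hidden through a CSS class or on other elements would slip through, and the names VisibleText and visible_text are made up.

```python
import re
from html.parser import HTMLParser

HIDDEN_STYLE = re.compile(r"display\s*:\s*none", re.IGNORECASE)


class VisibleText(HTMLParser):
    """Collects text, skipping anything inside <span style="display:none">."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._hidden_depth = 0  # > 0 while inside a hidden span

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        if self._hidden_depth:
            # A span nested inside a hidden span stays hidden.
            self._hidden_depth += 1
        elif HIDDEN_STYLE.search(dict(attrs).get("style") or ""):
            self._hidden_depth = 1

    def handle_endtag(self, tag):
        if tag == "span" and self._hidden_depth:
            self._hidden_depth -= 1

    def handle_data(self, data):
        if not self._hidden_depth:
            self.parts.append(data)


def visible_text(cell_html):
    """Return the text of a table cell without hidden sort-helper spans."""
    parser = VisibleText()
    parser.feed(cell_html)
    parser.close()
    return "".join(parser.parts).strip()
```

Applied to the example above, only the displayed number survives, so the cell can be parsed as a number afterwards.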
Original comment by tfmorris
on 14 Oct 2010 at 6:25
Original comment by tfmorris
on 18 Sep 2012 at 3:16
Original issue reported on code.google.com by
tfmorris
on 21 May 2010 at 5:25