Open GoogleCodeExporter opened 9 years ago
Committed r5313.
With this change the example call:
testWikipediaENAPI("Wikipedia:Hauptseite/Artikel_des_Tages/Montag",
"http://de.wikipedia.org/w/api.php", Locale.GERMAN);
creates at least the HTML file and downloads the referenced image file.
To avoid caching the template files, you can copy (don't derive) your own wiki
model from the APIWikiModel, override the getRawWikiContent() method, and
eliminate the usage of the Derby database.
When a template name is requested in your getRawWikiContent() method, don't
call fWikiDB.selectTopic(); read your own static files instead.
That way you should be able to eliminate the dependency on the Derby database.
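A minimal sketch of the "own static files" idea, kept independent of bliki so it runs on its own. The class name StaticTemplateStore and its methods are hypothetical and not part of the library; the assumption is that an overridden getRawWikiContent() would ask such a store for template text instead of calling fWikiDB.selectTopic().

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of bliki: an in-memory stand-in for the
// template lookup that fWikiDB.selectTopic() would otherwise perform.
class StaticTemplateStore {
    private final Map<String, String> templates = new HashMap<>();

    // Register the raw wiki text of one template under its name.
    void put(String templateName, String rawWikiText) {
        templates.put(normalize(templateName), rawWikiText);
    }

    // Returns the stored raw wiki text, or null when the template is unknown.
    String get(String templateName) {
        return templates.get(normalize(templateName));
    }

    // Treat underscores like spaces and ignore the case of the first letter,
    // which is how Wikipedia compares page titles.
    private static String normalize(String name) {
        String s = name.trim().replace('_', ' ');
        if (s.isEmpty()) {
            return s;
        }
        return Character.toUpperCase(s.charAt(0)) + s.substring(1);
    }

    public static void main(String[] args) {
        StaticTemplateStore store = new StaticTemplateStore();
        store.put("AdT-Vorschlag", "'''{{{LEMMA}}}'''");
        System.out.println(store.get("AdT-Vorschlag"));
        System.out.println(store.get("Unknown template"));
    }
}
```

Returning null for an unknown name leaves it to the caller to decide whether to fall back to another source or render nothing.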
Original comment by axelclk@gmail.com
on 26 May 2012 at 2:42
Wow, thank you for the fast support. :-)
I have successfully eliminated the downloading of images and the usage of the
Derby database by deriving an "own" wiki model as you suggested.
But the User object is still required to let the model load the templates and
the article via the URL itself. Do you have an idea how this can be eliminated?
At the moment I parse the raw wiki text myself and let bliki just convert the
article text. But my self-implemented parsing logic for raw wiki text is very
complicated and cannot handle all situations. Therefore I would like to use
bliki for parsing the raw wiki text as well.
For the current parsing and rendering I have a lot of unit tests that check the
final rendering with various contents. These tests cannot be switched to the
new bliki integration, because I can only call bliki with a URL and cannot
inject the pre-loaded raw wiki text as a string. The unit tests should of
course not load the text via the internet. That would not work anyway, because
the content behind the URL of the article of the day changes weekly. ;-)
So I need either a way to initialize the wiki model with pre-loaded raw wiki
text as a string or InputStream, or a way to mock the remote call that loads
the article. I could not find one yet. I tried to replace "List<Page> thePages =
myUser.queryContent(thePageTitles)" within "getRawWikiContent" with
"XMLPagesParser theParser = new XMLPagesParser(theRawWikiTextAsString).parse;
List<Page> thePages = theParser.getPagesList();", but could not get it working
yet because this method is also used to load templates.
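For unit tests that must not touch the network, one option is to keep a saved API response as a string and pull the raw wiki text out of it locally. The sketch below uses only the JDK's DOM parser rather than bliki's XMLPagesParser; the class and method names are hypothetical, and it assumes the response has the <rev> element layout shown in the example later in this thread.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of bliki: extracts the raw wiki text from a
// stored MediaWiki API response (format=xml) without any remote call.
class RevisionTextExtractor {

    // Returns the text of the first <rev> element, or null if there is none.
    static String firstRevisionText(String apiResponseXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(apiResponseXml)));
            NodeList revs = doc.getElementsByTagName("rev");
            return revs.getLength() == 0 ? null : revs.item(0).getTextContent();
        } catch (Exception e) {
            // Wrap parser errors so callers (e.g. tests) need no checked handling.
            throw new IllegalArgumentException("Not a parseable API response", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\"?><api><query><pages><page title=\"T\">"
                + "<revisions><rev xml:space=\"preserve\">{{Foo}} text</rev></revisions>"
                + "</page></pages></query></api>";
        System.out.println(firstRevisionText(xml));
    }
}
```

The extracted string could then be handed to whatever entry point accepts raw wiki text, while templates are served from a separate local source.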
Thank you in advance
Regards,
Sven S.
Original comment by sven.strohschein@googlemail.com
on 26 May 2012 at 10:11
I committed r5377.
With these new methods:
DocumentCreator#renderToFile(String rawWikiText, String title, ITextConverter converter, String filename) throws IOException;
HTMLCreatorExample#testWikipediaText(String rawWikiText, String title, Locale locale);
you can render a wiki text snippet directly into a file.
This is a quick and dirty solution.
You should copy DocumentCreator to your own class and delete/refactor the
things you don't need.
If possible please contribute back your finished solution, so that other users
can also use your Creator and WikiModel classes.
Original comment by axelclk@gmail.com
on 28 May 2012 at 9:48
Hi,
it is almost done and I will post it or provide a patch when it is ready. One
thing regarding the article image is strange. The example at the bottom
contains the image name/reference "Datei:Nyatapole2.jpg", but when I convert it
to HTML with bliki, it results in "Datei:116px-Nyatapole2.jpg". The image size
is prepended to the filename, which isn't correct. The concrete image can be
found via
"http://de.wikipedia.org/w/api.php?action=query&titles=Datei:Nyatapole2.jpg&prop=imageinfo&iiprop=url&format=xml",
but not with the bliki-modified image name:
"http://de.wikipedia.org/w/api.php?action=query&titles=Datei:116px-Nyatapole2.jpg&prop=imageinfo&iiprop=url&format=xml".
Do you have an idea why this is happening and how it can be avoided?
Example
<?xml version="1.0"?><api><query><normalized><n
from="Wikipedia:Hauptseite/Artikel_des_Tages/Donnerstag"
to="Wikipedia:Hauptseite/Artikel des
Tages/Donnerstag"/></normalized><pages><page pageid="964888" ns="4"
title="Wikipedia:Hauptseite/Artikel des Tages/Donnerstag"><revisions><rev
xml:space="preserve">{{Shortcut|WP:ADTDO}}{{Wikipedia:Hauptseite/Artikel des
Tages/Bearbeitungshinweise}}
<onlyinclude> {{AdT-Vorschlag
| DATUM = 28.07.2011
| LEMMA = Bhaktapur
| BILD = Datei:Nyatapole2.jpg
| BILDBESCHREIBUNG = Nyata-Tempel, 1708 erbaut, dreißig Meter hoch und der
hinduistischen Gottheit Lakshmi geweiht
| BILDGROESSE = 116px
| BILDUMRANDUNG =
| TEASERTEXT = '''[[Bhaktapur]]''' (nepali ??????? ‚Stadt der Frommen‘)
oder ''Khwopa'' (newari ???? ''Khvapa'') ist neben Kathmandu und Lalitpur mit
über 78.000 Einwohnern die dritte und kleinste der Königsstädte des
Kathmandutals in Nepal. Bhaktapur liegt am Fluss Hanumante und wie Kathmandu an
einer alten Handelsroute nach Tibet, was für den Reichtum der Stadt
verantwortlich war. Das Bild der Stadt wird bestimmt von der Landwirtschaft,
der Töpferkunst und besonders von einer lebendigen traditionellen
Musikerszene. Wegen seiner über 150 Musik- und 100 Kulturgruppen wird
Bhaktapur als Hauptstadt der darstellenden Künste Nepals bezeichnet. Die
Einwohner von Bhaktapur gehören ethnisch zu den Newar und zeichnen sich durch
einen hohen Anteil von 60 Prozent an Bauern der Jyapu-Kaste aus. Die Bewohner
sind zu fast 90 Prozent Hindus und zu zehn Prozent Buddhisten. Vom 14.
Jahrhundert bis zur zweiten Hälfte des 18. Jahrhunderts war Bhaktapur
Hauptstadt des Malla-Reiches. Aus dieser Zeit stammen viele der 172
Tempelanlagen, der 32 künstlichen Teiche und der mit Holzreliefs verzierten
Wohnhäuser. Zwar verursachte ein großes Erdbeben 1934 viele Schäden an den
Gebäuden, doch konnten diese wieder so instand gesetzt werden, dass Bhaktapurs
architektonisches Erbe bereits seit 1979 auf der UNESCO-Liste des
Weltkulturerbes steht.
}} </onlyinclude>
[[Kategorie:Wikipedia:Hauptseite/Artikel des
Tages|Donnerstag]]</rev></revisions></page></pages></query></api>
Original comment by sven.strohschein@googlemail.com
on 31 May 2012 at 6:52
I'm appending the width with the "iiurlwidth" parameter like this:
http://de.wikipedia.org/w/api.php?action=query&titles=Datei:Nyatapole2.jpg&prop=imageinfo&iiprop=url&format=xml&iiurlwidth=116
See the example I've committed: r5528
See the info.bliki.wiki.impl.APIWikiModel#appendInternalImageLink() method for
details:
http://code.google.com/p/gwtwiki/source/browse/trunk/info.bliki.wiki/bliki-pdf/src/main/java/info/bliki/wiki/impl/APIWikiModel.java
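To make the query above concrete, here is a small sketch that assembles such an imageinfo URL. It is not the code from APIWikiModel; the class and method names are hypothetical. The title is percent-encoded, so the colon appears as %3A, which the API decodes like any other query parameter. Requires Java 10 or later for the Charset overload of URLEncoder.encode.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of bliki: builds a MediaWiki imageinfo query.
class ImageInfoUrl {

    // apiUrl is the wiki's api.php endpoint; imageTitle includes the namespace
    // prefix; a widthPx of 0 or less omits the iiurlwidth parameter.
    static String build(String apiUrl, String imageTitle, int widthPx) {
        StringBuilder url = new StringBuilder(apiUrl)
                .append("?action=query&titles=")
                .append(URLEncoder.encode(imageTitle, StandardCharsets.UTF_8))
                .append("&prop=imageinfo&iiprop=url&format=xml");
        if (widthPx > 0) {
            // Ask the API for a scaled thumbnail URL of this width.
            url.append("&iiurlwidth=").append(widthPx);
        }
        return url.toString();
    }

    public static void main(String[] args) {
        System.out.println(build("http://de.wikipedia.org/w/api.php",
                "Datei:Nyatapole2.jpg", 116));
    }
}
```

Passing the width as a separate parameter keeps the original file name intact, so no size information has to be encoded into the title.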
Original comment by axelclk@gmail.com
on 7 Jun 2012 at 7:02
Hm, I tried to override appendInternalImageLink, but the call parameters
already contain the "extended" image filename. Therefore appendInternalImageLink
cannot be what causes the magic extension.
hrefImageLink = "Datei:116px-Nyatapole2.jpg"
srcImageLink = "116px-Nyatapole2.jpg"
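As a workaround sketch (not the fix that was later committed), the size prefix can be removed again before the name is used in an API query. The helper below is hypothetical. It only strips a leading "<digits>px-" from the file name part, so hyphens elsewhere in the name are left alone; a file whose real name happens to start with such a prefix would be altered too.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of bliki: undoes the "116px-" style prefix.
class ImageSizePrefix {
    // Matches one or more digits followed by "px-" at the start of the name.
    private static final Pattern SIZE_PREFIX = Pattern.compile("^\\d+px-");

    // Accepts links with or without a namespace such as "Datei:".
    static String strip(String imageLink) {
        int colon = imageLink.indexOf(':');
        String namespace = colon >= 0 ? imageLink.substring(0, colon + 1) : "";
        String name = colon >= 0 ? imageLink.substring(colon + 1) : imageLink;
        return namespace + SIZE_PREFIX.matcher(name).replaceFirst("");
    }

    public static void main(String[] args) {
        System.out.println(strip("Datei:116px-Nyatapole2.jpg"));
        System.out.println(strip("116px-Nyatapole2.jpg"));
    }
}
```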
Original comment by sven.strohschein@googlemail.com
on 7 Jun 2012 at 9:58
Hi,
I have created a new "in-memory" APIWikiModel along with an example and another
modification to the DocumentCreator. Everything is contained within the
attached patch (SVN). Is it possible to apply and commit this patch?
Regards,
Sven S.
Original comment by sven.strohschein@googlemail.com
on 20 Aug 2012 at 7:35
I added your patch with this commit: r6831.
Original comment by axelclk@gmail.com
on 21 Aug 2012 at 9:32
Hi,
thanks for adding the patch.
I detected two new problems which I have fixed with another patch. Could you
please also add this patch?
1. Problem: When the image file has an SVG extension, the WikiModel changes the
extension from ".svg" to ".svg.png". This behavior isn't desired in the
in-memory model, because it breaks the image URL. I added a quick fix, as I did
for the image-size prefix described above. This should be improved in the
future, for example by making this overridable in the WikiModel.
2. Problem: An article image wasn't detected, because the prefix/namespace
check for images sometimes does not work.
INamespace#getImage() returned "Datei" (German locale for "File") and
INamespace#getImage() returned "Image", but the article contained "File" (not
localized). So all three prefixes should be checked, because some article
requests return "Datei" and others return "File".
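The two points above can be sketched as follows. This is an illustration under assumptions, not the contents of the patch: the class and method names are hypothetical, and the list of prefixes is passed in by the caller rather than read from INamespace.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helpers, not part of bliki.
class ImageLinkChecks {

    // True if the link starts with one of the given namespace prefixes
    // followed by a colon; the comparison ignores case.
    static boolean isImageLink(String link, List<String> prefixes) {
        int colon = link.indexOf(':');
        if (colon <= 0) {
            return false;
        }
        String namespace = link.substring(0, colon).trim();
        for (String prefix : prefixes) {
            if (prefix.equalsIgnoreCase(namespace)) {
                return true;
            }
        }
        return false;
    }

    // Turns "Name.svg.png" back into "Name.svg"; other names are unchanged.
    static String undoSvgPng(String fileName) {
        if (fileName.toLowerCase().endsWith(".svg.png")) {
            return fileName.substring(0, fileName.length() - ".png".length());
        }
        return fileName;
    }

    public static void main(String[] args) {
        List<String> prefixes = Arrays.asList("Datei", "Image", "File");
        System.out.println(isImageLink("File:Nyatapole2.jpg", prefixes));
        System.out.println(undoSvgPng("Wappen.svg.png"));
    }
}
```

Checking the localized prefix together with "Image" and "File" covers articles that mix localized and non-localized image links.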
Original comment by sven.strohschein@googlemail.com
on 1 Sep 2012 at 11:21
Added your patch with commit r6896.
Original comment by axelclk@gmail.com
on 2 Sep 2012 at 8:55
Hi,
I have improved the code again, for the last time. The solution is now more
stable (an error occurred when the original image name contained a "-" sign),
the code is now clean (the ToDo could also be resolved), and it should perform
better.
Could you please integrate the patch in 3.0.20? I hope it can also get deployed
to Sonatype soon. :-)
Thanks.
Original comment by sven.strohschein@googlemail.com
on 17 Oct 2013 at 9:27
I think the issue can also get marked as fixed when the
in-memory-support-3.patch is applied.
Original comment by sven.strohschein@googlemail.com
on 17 Oct 2013 at 9:29
Committed r9124 and r9125
Original comment by axelclk@gmail.com
on 20 Oct 2013 at 5:14
Original issue reported on code.google.com by
axelclk@gmail.com
on 26 May 2012 at 2:29