sweble / sweble-wikitext

The Sweble Wikitext Components module provides a parser for MediaWiki's wikitext and an engine trying to emulate the behavior of a MediaWiki.
http://sweble.org/sites/swc-devel/develop-latest/tooling/sweble/sweble-wikitext

Updating the jars with dependencies on the website #61

Closed: wetneb closed this issue 7 years ago

wetneb commented 7 years ago

Hi @hannesd,

I wonder if it would be possible to update http://sweble.org/downloads/swc-devel/master-latest/ with the latest releases?

Thank you so much for maintaining this library!

hannesd commented 7 years ago

Hi!

Do you need the jars with dependencies? I'm asking because you could download the "normal" jars from Maven as well. We have not maintained the downloads in quite a while. I'll have to look into this.

Valedix commented 7 years ago

Hello, I intended to send this message directly to the project owners but apparently you can't do that on GitHub, so I'm posting here. Sorry for going somewhat off-topic.

I'm new to both Git and Maven. Supposing I only want version 3.1.5 and don't need it to keep updating, what would be the best way to get Sweble into my project? So far I've tried a few things, none of which were satisfactory.

At first I found a working example project, which was great as it was succinct and all I had to do was modify it, but it was version 0.1.0. I then downloaded the 2.0.0 jar with dependencies from the same link given above, but I wasn't sure whether that was the latest version. The Download page suggested so, but the Nexus repositories gave different max versions for different things (tooling, hddiff, hddiff-parent, etc.), all of which were higher than 2.0.0.

At last I found this GitHub repository, downloaded release 3.1.5, and went into Eclipse -> Import -> Maven -> Existing Maven Project. It downloaded what I assume were the dependencies, but at the end it created several projects, most of which contained errors and thus could not be run. The errors were for things like imports of classes that were in a different project. Now maybe I could fix those manually, but I feel like I'm not going about this the way one is supposed to. I doubt everyone who makes use of this project has to go around moving classes to make the project run, so I'd like to ask what it is I am missing.

Under this current version, is the recommended way of getting hold of the project (so that another project of mine will be able to parse Wiki dumps and output a modified version) through Git or Maven?

Ps.: thank you for keeping this up. Sorry for the newbishness and the wall of text.

hannesd commented 7 years ago

Sadly the sweble.org page is quite out of date and I guess I should take parts of it down as it only confuses people and I simply do not have the time to maintain it properly.

Nowadays we only really offer the sources through GitHub and jars through maven central (and our own maven repository for snapshots). Unfortunately, not offering jars with dependencies through our downloads page makes life a bit harder for those who do not want to use tools like gradle or maven.

Currently the latest version of the org.sweble.wikitext stuff (which you need for parsing dumps) is 3.1.5 (see https://search.maven.org/#search%7Cga%7C1%7Cg%3A%22org.sweble.wikitext%22).

We offer three example applications:

All of which can be found in this repo in the develop branch or in one of the tagged releases, e.g.: https://github.com/sweble/sweble-wikitext/tree/sweble-wikitext-3.1.5/sweble-wikitext-components-parent/swc-example-basic

The way you imported the repo into Eclipse is correct as far as I can tell but it's not enough. Some of our libraries use aspectj and some maven plugins that are not supported by Eclipse out of the box. Installing support for those in Eclipse can be quite a challenge. However, the examples themselves should work out of the box in Eclipse.

I propose you only import the examples individually (select only the example's pom when doing the import, not the parent pom of the whole repo) and I would expect things to just work as all the libraries are downloaded by maven. You would still be able to browse the sources of all sweble code since we publish sources jars on maven central and if you configure Eclipse correctly it will download those as well.

Use either version 3.x (unstable, I make breaking changes without changing the major version number) or version 2.x (stable, I only apply bug fixes, but it does not have the latest features).

wetneb commented 7 years ago

@hannesd: yes I did need the jars with dependencies - I have started using sweble in OpenRefine (https://github.com/OpenRefine/OpenRefine), which still ships its dependencies with that method. So I had to figure out how to generate jars with dependencies from your source - it is not that hard but I was new to all the java build tools so it would have been much quicker to download them directly.

Thanks a lot for your work!

Valedix commented 7 years ago

Thank you for the in-depth reply. Over the last few days I've tried some different approaches to get the code working but have had no success so far: importing only individual examples, trying other IDEs, and trying different import methods.

In my previous reply I mentioned moving the classes manually but that didn't work either: project SWC - Sweble Dump Reader, for example, in its class org.sweble.wikitext.dumpreader.TestDumpReader_0_10.java imports org.sweble.wikitext.dumpreader.export_0_10.CaseType and I've been unable to find that one.

I've already installed AspectJ and installed the 3 Maven plugins that Eclipse tries to install but fails to (aspectj-maven-plugin 1.7, build-helper-maven-plugin 1.8, and maven-jaxb2-plugin 0.8.0). I also get the error:

.git directory could not be found! Please specify a valid [dotGitDirectory] in your pom.xml (pl.project13.maven:git-commit-id-plugin:2.1.12:revision:gather-git-information:initialize)

Which I don't understand, since none of the pom files require either git-commit-id-plugin or pl.project13. This is driving me crazy.
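The error message suggests the build ran in a directory that is not part of a Git checkout, which would be the case for a release archive that was downloaded and unpacked rather than cloned. As a quick sanity check, the following sketch looks for a `.git` entry in a directory or any of its parents. The class and method names (`DotGitCheck`, `findDotGit`) are hypothetical, and whether git-commit-id-plugin searches parent directories in the same way is an assumption, not something confirmed in this thread.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical helper: reports whether a directory lies inside a Git checkout.
final class DotGitCheck {

    // Walks from 'start' up to the filesystem root and returns the first
    // ".git" entry found (a directory in a normal clone, a file in a worktree).
    static Optional<Path> findDotGit(Path start) {
        Path dir = start.toAbsolutePath().normalize();
        while (dir != null) {
            Path candidate = dir.resolve(".git");
            if (Files.exists(candidate)) {
                return Optional.of(candidate);
            }
            dir = dir.getParent();
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Defaults to the current working directory, e.g. the project folder.
        Path start = Paths.get(args.length > 0 ? args[0] : ".");
        Optional<Path> dotGit = findDotGit(start);
        if (dotGit.isPresent()) {
            System.out.println("Found Git metadata at " + dotGit.get());
        } else {
            System.out.println("No .git found at or above " + start.toAbsolutePath().normalize());
        }
    }
}
```

If this reports that no `.git` was found, the sources were probably obtained as an archive, and cloning the repository instead would be the first thing to try.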

--New--

Alright, I managed to get it "working" by selecting only the topmost pom. Yet when I try to run the dumpcruncher example I get the message "Error: Could not find or load main class." If I try to build the project (through Maven) it gives me an error due to a lack of a .git directory somewhere. The project has no Build Path, and its icon in the left-side tree is missing the little J, which indicates that Eclipse doesn't consider it a Java project (it has no .classpath and I don't know how to create a valid one without access to the original .classpath).

Taking all this into consideration, can anyone please give me a rough outline of how you got this project working on a new PC?

Valedix commented 7 years ago

Alright, if I import only the topmost pom as Maven Project, Eclipse recognizes it as a Java Project, shows no errors, and I'd probably be able to run it, except that one of the Maven plugins expects a .git directory so I need to use Git as well.

There are a number of ways to import it as both Git and Maven, but no matter what I've tried, Eclipse won't recognize it as a Java Project; it lacks a Build Path and .classpath. Judging by 'mvn dependency:tree', everything seems to be in order, but if I try either 'mvn clean install -U' or 'Eclipse->Run as->Maven clean' then surefire fails on swc-parser-lazy with the following errors:

testParsedPrettyPrintedWikitextMatchesOriginal with { "exp-Saxby+Chambliss.wikitext", ... }; Expected in dir: nopkg-complex/pretty-printed.ast

testAstAfterPostprocessingMatchesReferenceAst with { "exp-Saxby+Chambliss.wikitext", ... }; Expected in dir: nopkg-complex/after-postprocessing.ast

Do I need to manually create these .ast files? In the XML files in the surefire-reports folder there is also:

de.fau.cs.osr.utils.visitor.VisitingException: java.lang.OutOfMemoryError: Java heap space

But I don't know whether that is created by the previous error.

hannesd commented 7 years ago

Please open a separate issue where we can discuss these problems. This issue is really about something else...

Assuming that you are checking out either master or develop I do not understand why you would get those errors. Both branches are built daily by Jenkins and "it just works". Jenkins doesn't do more than "git clone" and "mvn install".

Please take a look at https://osr.cs.fau.de/software/sweble-wikitext/simple-parser-example/ and let me know if that helps. I wrote that tutorial yesterday and also executed those steps successfully.

Valedix commented 7 years ago

Outstanding! Thank you very much. I didn't create a new issue because I thought of these as being used for bugs or feature requests and not "Ive got no idea what Im doing. Plz halp!". Sorry about that. If you wish I can delete some of my previous posts or you can move them elsewhere and close them as "solved", whatever works for you. Thanks again.

hannesd commented 7 years ago

No problem, we'll leave it as it is.

wetneb commented 7 years ago

Hi @hannesd, unless I am mistaken, this issue was not solved as http://sweble.org/downloads/swc-devel/master-latest/ still contains outdated files. I can open a separate issue if this one changed focus.

hannesd commented 7 years ago

Sorry, I've closed the issue prematurely. My proposed solution is to extend the maven build to also build a -jar-with-dependencies which will then be uploaded to maven central as well. From there you can download it. The http://sweble.org/downloads page will be taken offline as I cannot maintain it any longer.

Which version of sweble are you using and what libraries do you need? I will add code to create -jar-with-dependencies for swc-parser-lazy, swc-engine and swc-dumpreader version 3.x. Does that cover your use case?

wetneb commented 7 years ago

@hannesd that would be fantastic!

hannesd commented 7 years ago

Jars with dependencies are now available on maven central: https://osr.cs.fau.de/software/sweble-wikitext/sweble-wikitext-maven-artifacts/#jars-with-dependencies
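For anyone who wants to fetch such a jar without a build tool, the sketch below derives a download URL from Maven coordinates using the standard Maven repository layout. The group ID, the three artifact names and the `-jar-with-dependencies` suffix are taken from this thread. The class name `MavenArtifactUrl` is hypothetical, and it is an assumption that a jar with dependencies was actually published for the version used here (3.1.5 is simply the version mentioned above), so check the linked page for the versions that really exist.

```java
// Hypothetical helper: maps Maven coordinates to a Maven Central URL.
final class MavenArtifactUrl {

    // Base URL of Maven Central.
    static final String CENTRAL = "https://repo1.maven.org/maven2";

    // Standard repository layout:
    // <group with '/' instead of '.'>/<artifact>/<version>/<artifact>-<version>[-<classifier>].jar
    static String url(String groupId, String artifactId, String version, String classifier) {
        String suffix = (classifier == null || classifier.isEmpty()) ? "" : "-" + classifier;
        return CENTRAL + "/" + groupId.replace('.', '/')
                + "/" + artifactId
                + "/" + version
                + "/" + artifactId + "-" + version + suffix + ".jar";
    }

    public static void main(String[] args) {
        // Artifact names as listed earlier in this thread; the version is an assumption.
        String[] artifacts = {"swc-parser-lazy", "swc-engine", "swc-dumpreader"};
        for (String artifact : artifacts) {
            System.out.println(url("org.sweble.wikitext", artifact, "3.1.5", "jar-with-dependencies"));
        }
    }
}
```

Passing `null` or an empty string as the classifier yields the URL of the "normal" jar instead.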

The blog has been moved to our professorship's blog.