Closed mawiesne closed 6 years ago
This seems to be a regression introduced with the changes of #152 and #155.
@tgalery as you contributed the changes of #155, can you also have a look into this issue?
I can confirm that this also affects Windows 10. The stacktrace is similar to the one posted by @mawiesne in a Java 8 environment (it does not matter whether Oracle JDK or OpenJDK is used).
@mawiesne No idea. I hope @tgalery maybe has some insight.
Can someone post a bit of code that generates the stacktrace above?
import de.tudarmstadt.ukp.wikipedia.api.Page;
import de.tudarmstadt.ukp.wikipedia.api.Wikipedia;
import de.tudarmstadt.ukp.wikipedia.api.exception.WikiApiException;
public class Main {
public static void main(String[] args) throws WikiApiException {
Wikipedia wikipedia = new Wikipedia(new CustomDataSource("host", "dbname", "user", "password", "com.mysql.jdbc.Driver", false));
//German Wikipedia for example, page with title "Gesundheit"
Page page = wikipedia.getPage("Gesundheit");
//Exception will be thrown...
page.getPlainText();
}
}
with this implementation as CustomDataSource:
import de.tudarmstadt.ukp.wikipedia.api.DatabaseConfiguration;
import de.tudarmstadt.ukp.wikipedia.api.WikiConstants;
import de.tudarmstadt.ukp.wikipedia.api.WikiConstants.Language;
import org.slf4j.Logger;
import java.sql.*;
public class CustomDataSource extends DatabaseConfiguration {
private static final Logger logger = org.slf4j.LoggerFactory.getLogger(CustomDataSource.class);
private String jdbcURL;
private String databaseDriver;
/*
* needed to please frameworks like Spring... parameter injection is done
* via setters there
*/
public CustomDataSource() {
super();
}
public CustomDataSource(String hostName, String dbName, String user, String password, String driverClassName, boolean useSSL) {
this();
setDbName(dbName);
setHostName(hostName);
setPassword(password);
setUserName(user);
// check if the DB driver is available in the classpath
try {
Class.forName(driverClassName);
} catch (ClassNotFoundException e) {
logger.error(e.getLocalizedMessage(), e);
throw new RuntimeException(e.getLocalizedMessage(), e);
}
String baseJdbcURL = "jdbc:mysql://" + getHostName() + "/" + getDbName();
if(!hasExternalSSLParams(baseJdbcURL)) {
if (useSSL) {
setJdbcURL(baseJdbcURL + "?verifyServerCertificate=false&useSSL=true");
} else {
setJdbcURL(baseJdbcURL + "?useSSL=false");
}
} else {
setJdbcURL(baseJdbcURL);
}
Language lang = requestWikiLangFromDB(hostName, dbName, user, password);
setLanguage(lang);
}
private boolean hasExternalSSLParams(String baseJdbcURL) {
return baseJdbcURL.contains("useSSL=");
}
/*
* Although the JWPL-DataBase knows its Wikipedia language (described as
* <code>language</code> in the table <code>MetaData</code>), the
* {@link DatabaseConfiguration} needs to know the specified
* {@link Language}. Hence, it will be requested by this method so the user
* does not have to configure the {@link Language} manually.
*
* @param hostName
* @param dbName
* @param user
* @param password
* @return the language found in the <code>MetaData</code>-table, as
* enumeration instance of {@link Language}
* @throws RuntimeException if no language is found or the query fails
*/
private Language requestWikiLangFromDB(String hostName, String dbName, String user, String password) {
// Statement and ResultSet are managed by try-with-resources as well, so they are closed explicitly
try (Connection connection = DriverManager.getConnection(getJdbcURL(), user, password);
Statement stmnt = connection.createStatement();
ResultSet result = stmnt.executeQuery("Select language from MetaData")) {
if (result.next()) {
String languageString = result.getString(1);
logger.info("The language found at {}:{} is '{}' and will be set to this Wiki-DB connection", hostName, dbName, languageString);
if (languageString.equals("türkçe")) {
languageString = "turkish";
}
return WikiConstants.Language.valueOf(languageString);
} else {
throw new RuntimeException("No language could be found for this Wikipedia DB. This is very strange, check your DB setup!");
}
} catch (SQLException e) {
logger.error(e.getLocalizedMessage());
throw new RuntimeException(e);
}
}
public void setDbName(String dbName) {
assert dbName!=null;
assert dbName.trim().length() > 0;
super.setDatabase(dbName);
}
public String getDbName() {
return super.getDatabase();
}
public void setHostName(String hostName) {
assert hostName!=null;
assert hostName.trim().length() > 0;
super.setHost(hostName);
}
public String getHostName() {
return super.getHost();
}
public String getUserName() {
return super.getUser();
}
public void setUserName(String user) {
assert user!=null;
assert user.trim().length() > 0;
super.setUser(user);
}
/**
* @param databaseDriver the databaseDriver to set
*/
public void setDatabaseDriver(String databaseDriver) {
assert databaseDriver!=null;
assert databaseDriver.trim().length() > 0;
this.databaseDriver = databaseDriver;
}
public String getDatabaseDriver() {
return databaseDriver;
}
/**
* @param jdbcURL the jdbcURL to set
*/
public void setJdbcURL(String jdbcURL) {
assert jdbcURL!=null;
assert jdbcURL.trim().length() > 0;
this.jdbcURL = jdbcURL;
}
public String getJdbcURL() {
return jdbcURL;
}
@Override
public String getPassword() {
return super.getPassword();
}
@Override
public void setPassword(String password) {
super.setPassword(password);
}
@Override
public WikiConstants.Language getLanguage() {
return super.getLanguage();
}
@Override
public void setLanguage(WikiConstants.Language language) {
assert language != null;
super.setLanguage(language);
}
}
Will output:
Exception in thread "main" de.fau.cs.osr.utils.visitor.VisitingException: de.fau.cs.osr.utils.visitor.VisitorException: vClass: de.tudarmstadt.ukp.wikipedia.api.sweble.PlainTextConverter
nClass: org.sweble.wikitext.parser.nodes.WtText
Candidate 1: visit(org.sweble.wikitext.parser.nodes.WtNode)
Candidate 2: visit(de.fau.cs.osr.ptk.common.ast.AstText)
at de.fau.cs.osr.utils.visitor.VisitorBase.handleVisitingException(VisitorBase.java:92)
at de.fau.cs.osr.utils.visitor.VisitorLogic.resolveAndVisit(VisitorLogic.java:118)
at de.fau.cs.osr.utils.visitor.VisitorLogic.resolveAndVisit(VisitorLogic.java:90)
at de.fau.cs.osr.utils.visitor.VisitorBase.resolveAndVisit(VisitorBase.java:119)
at de.fau.cs.osr.ptk.common.AstVisitor.dispatch(AstVisitor.java:56)
at de.fau.cs.osr.ptk.common.AstVisitor.dispatch(AstVisitor.java:28)
at de.fau.cs.osr.utils.visitor.VisitorBase.go(VisitorBase.java:111)
at de.tudarmstadt.ukp.wikipedia.api.Page.parsePage(Page.java:610)
at de.tudarmstadt.ukp.wikipedia.api.Page.getPlainText(Page.java:591)
at de.hshn.mi.shc.etl.wiki.Main.main(Main.java:19)
Caused by: de.fau.cs.osr.utils.visitor.VisitorException: vClass: de.tudarmstadt.ukp.wikipedia.api.sweble.PlainTextConverter
nClass: org.sweble.wikitext.parser.nodes.WtText
Candidate 1: visit(org.sweble.wikitext.parser.nodes.WtNode)
Candidate 2: visit(de.fau.cs.osr.ptk.common.ast.AstText)
at de.fau.cs.osr.utils.visitor.VisitorLogic.resolveAndVisit(VisitorLogic.java:130)
at de.fau.cs.osr.utils.visitor.VisitorLogic.resolveAndVisit(VisitorLogic.java:90)
at de.fau.cs.osr.utils.visitor.VisitorBase.resolveAndVisit(VisitorBase.java:119)
at de.fau.cs.osr.ptk.common.AstVisitor.dispatch(AstVisitor.java:56)
at de.fau.cs.osr.ptk.common.AstVisitor.iterate(AstVisitor.java:66)
at de.tudarmstadt.ukp.wikipedia.api.sweble.PlainTextConverter.visit(PlainTextConverter.java:189)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at de.fau.cs.osr.utils.visitor.VisitorLogic$Target.invoke(VisitorLogic.java:361)
at de.fau.cs.osr.utils.visitor.VisitorLogic.resolveAndVisit(VisitorLogic.java:110)
... 8 more
Caused by: vClass: de.tudarmstadt.ukp.wikipedia.api.sweble.PlainTextConverter
nClass: org.sweble.wikitext.parser.nodes.WtText
Candidate 1: visit(org.sweble.wikitext.parser.nodes.WtNode)
Candidate 2: visit(de.fau.cs.osr.ptk.common.ast.AstText)
at de.fau.cs.osr.utils.visitor.VisitorLogic$1.compare(VisitorLogic.java:186)
at de.fau.cs.osr.utils.visitor.VisitorLogic$1.compare(VisitorLogic.java:168)
at java.util.TimSort.countRunAndMakeAscending(TimSort.java:355)
at java.util.TimSort.sort(TimSort.java:220)
at java.util.Arrays.sort(Arrays.java:1512)
at java.util.ArrayList.sort(ArrayList.java:1462)
at java.util.Collections.sort(Collections.java:175)
at de.fau.cs.osr.utils.visitor.VisitorLogic.findVisit(VisitorLogic.java:167)
at de.fau.cs.osr.utils.visitor.VisitorLogic.resolveAndVisit(VisitorLogic.java:105)
... 19 more
Can you help me understand something in your code? Looking at the JWPLDataSource, would that connect to a database which contains the relevant wiki pages? The creds look funny to me.
Basically:
1.) Create a connection to a database. In our case: a MySQL DB containing the Wikipedia dumps and therefore the Wikipedia pages.
2.) I left out the real credentials ;)
3.) Retrieve a page of interest (it does not matter which one).
4.) Try to retrieve the full text via getPlainText().
Gotcha, and sorry for being a pain; I use this in the context of json wikipedia. Is the MySQL database populated by downloading and importing SQL files from https://dumps.wikimedia.org/enwiki/20180320/ (if so, could you let me know which ones), or is the full XML dump transformed into SQL by some CLI tool in advance?
We make use of the DataMachine tool, provided by the JWPL project, see here: https://dkpro.github.io/dkpro-jwpl/DataMachine/
The resulting files are then imported into a MySQL 5.7 installation.
For a German version of Wikipedia dumps, we basically use:
java -Xmx2g -jar JWPLDataMachine.jar german !Hauptkategorie Begriffsklärung ~/dewiki/$date-of-snapshot$/
as given in the examples section of the how-to.
Cool, could I get the exact command you guys used to produce the German (or any other language) dump? I will try to replicate the bug and see if there's an easy fix.
I updated the code snippet above so that it no longer uses internal classes, and provided the related code needed to execute it.
@tgalery Thanks a ton for looking into this! I will upload a dump of a transformed version of the German Wikipedia DB dating from Jan 2018. Stay tuned, the next comment with instructions will follow shortly.
@mawiesne that would be extremely helpful
@tgalery Download one or both of the two mysql dumps from here:
Re-Import them on your local dev-machine via:
CREATE DATABASE wikipedia_de_jwpl_Jan2018 CHARACTER SET UTF8;
GRANT ALL ON wikipedia_de_jwpl_Jan2018.* TO username@'%' IDENTIFIED BY "password";
gunzip < wikipedia_de_jwpl_Jan2018.sql.gz | mysql --quick --user=root -p
The same procedure applies to the smaller Spanish (es) version; just replace 'de' with 'es'. If you decide to use es, you could, for instance, fetch a page such as "Salud".
Cheers, will give you guys an update as soon as I can.
Some updates. I've been trying to debug this using the Spanish dump, as it's slightly smaller. But it seems I get an exception when instantiating the Wikipedia class. I'm using Scala and I get the following:
scala> import de.tudarmstadt.ukp.wikipedia.api._
import de.tudarmstadt.ukp.wikipedia.api._
scala> val source = new CustomDataSource("host", "dbname", "user", "password", "com.mysql.jdbc.Driver", false)
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
source: de.tudarmstadt.ukp.wikipedia.api.CustomDataSource = de.tudarmstadt.ukp.wikipedia.api.CustomDataSource@3ac02398
scala> val wikipedia = new Wikipedia(source)
log4j:WARN No appenders could be found for logger (de.tudarmstadt.ukp.wikipedia.api.Wikipedia).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
org.hibernate.tool.schema.spi.SchemaManagementException: Schema-validation: missing column [version] in table [MetaData]
at org.hibernate.tool.schema.internal.AbstractSchemaValidator.validateTable(AbstractSchemaValidator.java:136)
at org.hibernate.tool.schema.internal.GroupedSchemaValidatorImpl.validateTables(GroupedSchemaValidatorImpl.java:42)
at org.hibernate.tool.schema.internal.AbstractSchemaValidator.performValidation(AbstractSchemaValidator.java:89)
at org.hibernate.tool.schema.internal.AbstractSchemaValidator.doValidation(AbstractSchemaValidator.java:68)
at org.hibernate.tool.schema.spi.SchemaManagementToolCoordinator.performDatabaseAction(SchemaManagementToolCoordinator.java:191)
at org.hibernate.tool.schema.spi.SchemaManagementToolCoordinator.process(SchemaManagementToolCoordinator.java:72)
at org.hibernate.internal.SessionFactoryImpl.<init>(SessionFactoryImpl.java:312)
at org.hibernate.boot.internal.SessionFactoryBuilderImpl.build(SessionFactoryBuilderImpl.java:462)
at org.hibernate.cfg.Configuration.buildSessionFactory(Configuration.java:710)
at de.tudarmstadt.ukp.wikipedia.api.hibernate.WikiHibernateUtil.getSessionFactory(WikiHibernateUtil.java:51)
at de.tudarmstadt.ukp.wikipedia.api.Wikipedia.__getHibernateSession(Wikipedia.java:761)
at de.tudarmstadt.ukp.wikipedia.api.MetaData.<init>(MetaData.java:44)
at de.tudarmstadt.ukp.wikipedia.api.Wikipedia.<init>(Wikipedia.java:87)
... 42 elided
Is there something wrong with the Spanish dump I downloaded above?
Commenting out hibernate auto validation gives me this:
Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Unknown column 'metadata0_.version' in 'field list'
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
at com.mysql.jdbc.Util.getInstance(Util.java:386)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1052)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3597)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3529)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1990)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2151)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2625)
at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2119)
at com.mysql.jdbc.PreparedStatement.executeQuery(PreparedStatement.java:2281)
at org.hibernate.engine.jdbc.internal.ResultSetReturnImpl.extract(ResultSetReturnImpl.java:60)
... 60 more
So ... maybe there's something funny with the dump?
@tgalery I think I know what went wrong, and I'll provide two modified/fresh dumps next Monday.
UPDATE: Re-Download one of the two files and check sha1sum afterwards:
German version (4.5G):
https://download.mi.hs-heilbronn.de/tulum/wikipedia_de_jwpl_Jan2018.sql.gz
sha1sum should match f837788b0fe5c5b564fd22f11213be9d718190f4
Spanish version (2.7G):
https://download.mi.hs-heilbronn.de/tulum/wikipedia_es_jwpl_Jan2018.sql.gz
sha1sum should match dc33b2975e4243217e13658685de2bcf3677975a
Remove all previous files / imported DBs and conduct a re-import. It should work now as I've dumped it from one of our production systems in which no DB schema errors are present.
Again, sorry for any inconvenience.
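For anyone who prefers to check the download from Java rather than with the sha1sum tool, here is a minimal sketch using only the JDK. The class name DumpChecksum and its method names are made up for illustration and are not part of JWPL; the expected value is whatever checksum is listed above for the file you downloaded.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of JWPL: computes SHA-1 checksums with the JDK only.
class DumpChecksum {

    // SHA-1 is required on every Java platform, so a missing algorithm is treated as a fatal setup error.
    static MessageDigest newSha1() {
        try {
            return MessageDigest.getInstance("SHA-1");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Formats a digest as lowercase hex, the same notation sha1sum prints.
    static String toHex(byte[] digest) {
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    // Convenience overload for small in-memory inputs.
    static String sha1Hex(byte[] data) {
        return toHex(newSha1().digest(data));
    }

    // Streams the input in chunks, so a multi-gigabyte dump never has to fit into memory.
    static String sha1Hex(InputStream in) throws IOException {
        MessageDigest digest = newSha1();
        byte[] buffer = new byte[8192];
        int read;
        while ((read = in.read(buffer)) != -1) {
            digest.update(buffer, 0, read);
        }
        return toHex(digest.digest());
    }

    // Usage: java DumpChecksum.java <path-to-dump.sql.gz> <expected-sha1>
    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: DumpChecksum <file> <expected-sha1>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            String actual = sha1Hex(in);
            System.out.println(actual.equalsIgnoreCase(args[1].trim())
                    ? "checksum matches"
                    : "checksum MISMATCH, got " + actual);
        }
    }
}
```

If the checksum does not match, download the file again before re-importing it.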
It seems to be a problem with the reflection code in de.fau.cs.osr.utils.visitor.VisitorLogic, which cannot determine the correct visit method at runtime.
Line 361ff
public Object invoke(VisitorInterface<?> visitor, Object node)
throws IllegalArgumentException,
IllegalAccessException,
InvocationTargetException
{
touch();
return method.invoke(visitor, node);
}
The parameter types of both candidates
Candidate 1: visit(org.sweble.wikitext.parser.nodes.WtNode)
Candidate 2: visit(de.fau.cs.osr.ptk.common.ast.AstText)
extend the same interfaces and both match WtText, so neither candidate is more specific than the other, which leads to this issue.
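To illustrate the ambiguity outside of Sweble, here is a small self-contained sketch. All types in it (NodeLike, TextBase, TextNode and the two visitors) are hypothetical stand-ins for WtNode, AstText, WtText and PlainTextConverter, and the resolver is a simplified loop, not the comparator-based sort that the stack trace shows inside VisitorLogic. It only assumes what the trace tells us: the node class is assignable to both candidate parameter types.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo, not Sweble code: shows why two unrelated supertypes make dispatch ambiguous.
class AmbiguousVisitDemo {

    interface NodeLike {}                                          // plays the role of WtNode
    static class TextBase {}                                       // plays the role of AstText
    static class TextNode extends TextBase implements NodeLike {}  // plays the role of WtText

    // Offers only the two overloads listed as candidates in the stack trace.
    static class AmbiguousVisitor {
        public String visit(NodeLike n) { return "node"; }
        public String visit(TextBase t) { return "text-base"; }
    }

    // Adds an overload for the concrete node type, which is more specific than both others.
    static class SpecificVisitor extends AmbiguousVisitor {
        public String visit(TextNode t) { return "text-node"; }
    }

    // Picks the visit(...) overload whose parameter type is a subtype of every other candidate's.
    static Method resolve(Class<?> visitorClass, Class<?> nodeClass) {
        List<Method> candidates = new ArrayList<>();
        for (Method m : visitorClass.getMethods()) {
            if (m.getName().equals("visit")
                    && m.getParameterCount() == 1
                    && m.getParameterTypes()[0].isAssignableFrom(nodeClass)) {
                candidates.add(m);
            }
        }
        for (Method best : candidates) {
            boolean mostSpecific = true;
            for (Method other : candidates) {
                if (!other.getParameterTypes()[0].isAssignableFrom(best.getParameterTypes()[0])) {
                    mostSpecific = false;
                    break;
                }
            }
            if (mostSpecific) {
                return best;
            }
        }
        throw new IllegalStateException("no unique most specific visit(...) for "
                + nodeClass.getSimpleName() + " among " + candidates.size() + " candidate(s)");
    }

    // Returns the simple name of the chosen parameter type, or "ambiguous" if resolution fails.
    static String describe(Class<?> visitorClass, Class<?> nodeClass) {
        try {
            return resolve(visitorClass, nodeClass).getParameterTypes()[0].getSimpleName();
        } catch (IllegalStateException e) {
            return "ambiguous";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(AmbiguousVisitor.class, TextNode.class));
        System.out.println(describe(SpecificVisitor.class, TextNode.class));
    }
}
```

In this sketch, an overload for the concrete node type removes the ambiguity. Whether the eventual fix in JWPL takes that route is not stated in this thread.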
In our group at Heilbronn University, we managed to reproduce this bug with the existing test case PageTest#testPlainText() and the test DB provided in #2, see:
org.junit.internal.AssumptionViolatedException: got: <de.fau.cs.osr.utils.visitor.VisitingException: de.fau.cs.osr.utils.visitor.VisitorException: vClass: de.tudarmstadt.ukp.wikipedia.api.sweble.PlainTextConverter
nClass: org.sweble.wikitext.parser.nodes.WtText
Candidate 1: visit(org.sweble.wikitext.parser.nodes.WtNode)
Candidate 2: visit(de.fau.cs.osr.ptk.common.ast.AstText)
>, expected: null
at org.junit.Assume.assumeThat(Assume.java:95)
at org.junit.Assume.assumeNoException(Assume.java:142)
at de.tudarmstadt.ukp.wikipedia.api.PageTest.testPlainText(PageTest.java:100)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at org.junit.runner.JUnitCore.run(JUnitCore.java:160)
at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47)
at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
Caused by: de.fau.cs.osr.utils.visitor.VisitingException: de.fau.cs.osr.utils.visitor.VisitorException: vClass: de.tudarmstadt.ukp.wikipedia.api.sweble.PlainTextConverter
nClass: org.sweble.wikitext.parser.nodes.WtText
Candidate 1: visit(org.sweble.wikitext.parser.nodes.WtNode)
Candidate 2: visit(de.fau.cs.osr.ptk.common.ast.AstText)
at de.fau.cs.osr.utils.visitor.VisitorBase.handleVisitingException(VisitorBase.java:92)
at de.fau.cs.osr.utils.visitor.VisitorLogic.resolveAndVisit(VisitorLogic.java:118)
at de.fau.cs.osr.utils.visitor.VisitorLogic.resolveAndVisit(VisitorLogic.java:90)
at de.fau.cs.osr.utils.visitor.VisitorBase.resolveAndVisit(VisitorBase.java:119)
at de.fau.cs.osr.ptk.common.AstVisitor.dispatch(AstVisitor.java:56)
at de.fau.cs.osr.ptk.common.AstVisitor.iterate(AstVisitor.java:66)
at de.tudarmstadt.ukp.wikipedia.api.sweble.PlainTextConverter.visit(PlainTextConverter.java:346)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at de.fau.cs.osr.utils.visitor.VisitorLogic$Target.invoke(VisitorLogic.java:361)
at de.fau.cs.osr.utils.visitor.VisitorLogic.resolveAndVisit(VisitorLogic.java:110)
at de.fau.cs.osr.utils.visitor.VisitorLogic.resolveAndVisit(VisitorLogic.java:90)
at de.fau.cs.osr.utils.visitor.VisitorBase.resolveAndVisit(VisitorBase.java:119)
at de.fau.cs.osr.ptk.common.AstVisitor.dispatch(AstVisitor.java:56)
at de.fau.cs.osr.ptk.common.AstVisitor.iterate(AstVisitor.java:66)
at de.tudarmstadt.ukp.wikipedia.api.sweble.PlainTextConverter.visit(PlainTextConverter.java:189)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at de.fau.cs.osr.utils.visitor.VisitorLogic$Target.invoke(VisitorLogic.java:361)
at de.fau.cs.osr.utils.visitor.VisitorLogic.resolveAndVisit(VisitorLogic.java:110)
at de.fau.cs.osr.utils.visitor.VisitorLogic.resolveAndVisit(VisitorLogic.java:90)
at de.fau.cs.osr.utils.visitor.VisitorBase.resolveAndVisit(VisitorBase.java:119)
at de.fau.cs.osr.ptk.common.AstVisitor.dispatch(AstVisitor.java:56)
at de.fau.cs.osr.ptk.common.AstVisitor.dispatch(AstVisitor.java:28)
at de.fau.cs.osr.utils.visitor.VisitorBase.go(VisitorBase.java:111)
at de.tudarmstadt.ukp.wikipedia.api.Page.parsePage(Page.java:610)
at de.tudarmstadt.ukp.wikipedia.api.Page.getPlainText(Page.java:591)
at de.tudarmstadt.ukp.wikipedia.api.PageTest.testPlainText(PageTest.java:98)
... 23 more
Caused by: de.fau.cs.osr.utils.visitor.VisitorException: vClass: de.tudarmstadt.ukp.wikipedia.api.sweble.PlainTextConverter
nClass: org.sweble.wikitext.parser.nodes.WtText
Candidate 1: visit(org.sweble.wikitext.parser.nodes.WtNode)
Candidate 2: visit(de.fau.cs.osr.ptk.common.ast.AstText)
at de.fau.cs.osr.utils.visitor.VisitorLogic.resolveAndVisit(VisitorLogic.java:130)
at de.fau.cs.osr.utils.visitor.VisitorLogic.resolveAndVisit(VisitorLogic.java:90)
at de.fau.cs.osr.utils.visitor.VisitorBase.resolveAndVisit(VisitorBase.java:119)
at de.fau.cs.osr.ptk.common.AstVisitor.dispatch(AstVisitor.java:56)
at de.fau.cs.osr.ptk.common.AstVisitor.iterate(AstVisitor.java:66)
at de.tudarmstadt.ukp.wikipedia.api.sweble.PlainTextConverter.visit(PlainTextConverter.java:210)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at de.fau.cs.osr.utils.visitor.VisitorLogic$Target.invoke(VisitorLogic.java:361)
at de.fau.cs.osr.utils.visitor.VisitorLogic.resolveAndVisit(VisitorLogic.java:110)
... 53 more
Caused by: vClass: de.tudarmstadt.ukp.wikipedia.api.sweble.PlainTextConverter
nClass: org.sweble.wikitext.parser.nodes.WtText
Candidate 1: visit(org.sweble.wikitext.parser.nodes.WtNode)
Candidate 2: visit(de.fau.cs.osr.ptk.common.ast.AstText)
at de.fau.cs.osr.utils.visitor.VisitorLogic$1.compare(VisitorLogic.java:186)
at de.fau.cs.osr.utils.visitor.VisitorLogic$1.compare(VisitorLogic.java:168)
at java.util.TimSort.countRunAndMakeAscending(TimSort.java:355)
at java.util.TimSort.sort(TimSort.java:220)
at java.util.Arrays.sort(Arrays.java:1512)
at java.util.ArrayList.sort(ArrayList.java:1462)
at java.util.Collections.sort(Collections.java:175)
at de.fau.cs.osr.utils.visitor.VisitorLogic.findVisit(VisitorLogic.java:167)
at de.fau.cs.osr.utils.visitor.VisitorLogic.resolveAndVisit(VisitorLogic.java:105)
... 64 more
CI did not complain because of #161
Cool, I'm assuming this will be reproducible once #162 gets merged?
Yes
@tgalery Any updates here? :)
@rzo1 @tgalery It seems I found a fix for this issue locally. I will push a branch and open a PR once the related test case works as expected.
Finally fixed via PR #185
With the introduction of Sweble 3.1.7 to the JWPL 1.2.0-SNAPSHOT line, I can no longer fetch plain text data from Wikipedia backends via Page.getPlainText. The stacktrace is documented here: It seems there is a mismatch of method signatures and/or incompatible libraries being used at runtime. I consider this a major bug, as parts of the main functionality are affected. Therefore, this bug should be fixed before releasing JWPL 1.2.0 (Final).
Dependencies involved:
System environment:
Any ideas, @ferschke / @reckart? Can somebody contact the colleagues at FAU Erlangen to investigate this issue?