tada / pljava

PL/Java is a free add-on module that brings Java™ Stored Procedures, Triggers, Functions, Aggregates, Operators, Types, etc., to the PostgreSQL™ backend.
http://tada.github.io/pljava/

URL containing dbf is not supported by java.net.URL #266

Open petrjencek opened 4 years ago

petrjencek commented 4 years ago

Class org.postgresql.pljava.sqlj.Loader creates URLs with the protocol "dbf", which java.net.URL does not support. When a URL created by org.postgresql.pljava.sqlj.Loader is later opened through java.net.URL, it throws: java.net.MalformedURLException: unknown protocol: dbf

In my case I'm trying to compile Jasper report and the following exception occurs when run within pljava:

net.sf.jasperreports.engine.JRException: org.xml.sax.SAXParseException; lineNumber: 2; columnNumber: 402; schema_reference.4: Failed to read schema document 'dbf://localhost/1057338', because 1) could not find the document; 2) the document could not be read; 3) the root element of the document is not <xsd:schema>.
    at net.sf.jasperreports.engine.xml.JRXmlLoader.loadXML(JRXmlLoader.java:303)
    at net.sf.jasperreports.engine.xml.JRXmlLoader.loadXML(JRXmlLoader.java:286)
    at net.sf.jasperreports.engine.xml.JRXmlLoader.load(JRXmlLoader.java:275)
    at net.sf.jasperreports.engine.xml.JRXmlLoader.load(JRXmlLoader.java:220)
    at net.sf.jasperreports.engine.xml.JRXmlLoader.load(JRXmlLoader.java:195)
    at net.sf.jasperreports.engine.xml.JRXmlLoader.load(JRXmlLoader.java:186)
    at net.sf.jasperreports.engine.JasperCompileManager.compileToFile(JasperCompileManager.java:254)
    at net.sf.jasperreports.engine.JasperCompileManager.compileReportToFile(JasperCompileManager.java:555)
    at net.blahovec.framework.report.JasperReportsLib$ReportCompileTaskInternal.compile(JasperReportsLib.java:625)
    at net.blahovec.pg.lib.PgSysReportLib.reportLayoutJasperCompile(PgSysReportLib.java:44)
    at net.blahovec.pg.BlahovecNetPg.sysReportLayoutJasperCompile(BlahovecNetPg.java:111)
    at org.postgresql.pljava.internal@1.6.0-SNAPSHOT/org.postgresql.pljava.internal.EntryPoints.refInvoke(EntryPoints.java:53)
Caused by: org.xml.sax.SAXParseException; lineNumber: 2; columnNumber: 402; schema_reference.4: Failed to read schema document 'dbf://localhost/1057338', because 1) could not find the document; 2) the document could not be read; 3) the root element of the document is not <xsd:schema>.
    at java.xml/com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:204)
    at java.xml/com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.error(ErrorHandlerWrapper.java:135)
    at java.xml/com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:396)
    at java.xml/com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:306)
    at java.xml/com.sun.org.apache.xerces.internal.impl.xs.traversers.XSDHandler.reportSchemaErr(XSDHandler.java:4257)
    at java.xml/com.sun.org.apache.xerces.internal.impl.xs.traversers.XSDHandler.reportSchemaError(XSDHandler.java:4240)
    at java.xml/com.sun.org.apache.xerces.internal.impl.xs.traversers.XSDHandler.getSchemaDocument1(XSDHandler.java:2531)
    at java.xml/com.sun.org.apache.xerces.internal.impl.xs.traversers.XSDHandler.getSchemaDocument(XSDHandler.java:2238)
    at java.xml/com.sun.org.apache.xerces.internal.impl.xs.traversers.XSDHandler.parseSchema(XSDHandler.java:588)
    at java.xml/com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaLoader.processJAXPSchemaSource(XMLSchemaLoader.java:844)
    at java.xml/com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaLoader.loadSchema(XMLSchemaLoader.java:606)
    at java.xml/com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaValidator.findSchemaGrammar(XMLSchemaValidator.java:2710)
    at java.xml/com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaValidator.handleStartElement(XMLSchemaValidator.java:2069)
    at java.xml/com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaValidator.startElement(XMLSchemaValidator.java:829)
    at java.xml/com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(XMLNSDocumentScannerImpl.java:374)
    at java.xml/com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl$NSContentDriver.scanRootElementHook(XMLNSDocumentScannerImpl.java:613)
    at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:3063)
    at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(XMLDocumentScannerImpl.java:836)
    at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:605)
    at java.xml/com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:112)
    at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:534)
    at java.xml/com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:888)
    at java.xml/com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:824)
    at java.xml/com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
    at java.xml/com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1216)
    at java.xml/com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:635)
    at org.apache.commons.digester.Digester.parse(Digester.java:1892)
    at net.sf.jasperreports.engine.xml.JRXmlLoader.loadXML(JRXmlLoader.java:299)
    ... 11 more
Caused by: java.net.MalformedURLException: unknown protocol: dbf
    at java.base/java.net.URL.<init>(URL.java:652)
    at java.base/java.net.URL.<init>(URL.java:541)
    at java.base/java.net.URL.<init>(URL.java:488)
    at java.xml/com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:649)
    at java.xml/com.sun.org.apache.xerces.internal.impl.XMLVersionDetector.determineDocVersion(XMLVersionDetector.java:150)
    at java.xml/com.sun.org.apache.xerces.internal.impl.xs.opti.SchemaParsingConfig.parse(SchemaParsingConfig.java:593)
    at java.xml/com.sun.org.apache.xerces.internal.impl.xs.opti.SchemaParsingConfig.parse(SchemaParsingConfig.java:696)
    at java.xml/com.sun.org.apache.xerces.internal.impl.xs.opti.SchemaDOMParser.parse(SchemaDOMParser.java:530)
    at java.xml/com.sun.org.apache.xerces.internal.impl.xs.traversers.XSDHandler.getSchemaDocument(XSDHandler.java:2226)
    ... 31 more

The same code works without any problem when I run it directly in Java (executed from Eclipse in my case). I'm using JDK11.0.6 but I've tried OpenJDK 11.0.7 with the same result.
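For readers who want to see the innermost failure in isolation, here is a minimal sketch that runs in a plain JVM outside PL/Java. It assumes no handler for the dbf scheme is registered in that JVM; the class and method names are made up for illustration.

```java
import java.net.MalformedURLException;
import java.net.URL;

final class UnknownSchemeDemo {
    /** Returns "ok" if the spec parses, otherwise the MalformedURLException message. */
    static String tryParse(String spec) {
        try {
            // No handler is supplied here, so the JVM must find one for the scheme on its own.
            new URL(spec);
            return "ok";
        } catch (MalformedURLException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // The same URL string that appears in the stack trace above.
        System.out.println(tryParse("dbf://localhost/1057338"));
        // A scheme the JVM knows about, for comparison.
        System.out.println(tryParse("file:///tmp/example.xsd"));
    }
}
```

The first call should report the same "unknown protocol" message as the last "Caused by" in the trace, because java.net.URL is asked to build a URL from a bare string.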

jcflack commented 4 years ago

Interesting! I take it you used getResource on Class or ClassLoader and it gave you the dbf: URL?

It might be worth seeing whether you can open that URL (with openStream()); if you can, that means the dbf: scheme works for general Java purposes, but is being blacklisted for the XML parser. That's what I suspect is happening, but I'll have to try it to be sure.

If that works, it might give you a workaround: open the URL yourself and get the schema with something like newSchema(new StreamSource(url.openStream())) rather than just newSchema(url). Not seeing more of your code I don't know how well that would fit, but something to try.
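A minimal sketch of that workaround, assuming you control the code that loads the schema (which may not be true when a library does it for you). The class name and the temporary file in main are only for illustration; in PL/Java the URL would come from getResource.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

final class SchemaFromStream {
    /** Opens the URL ourselves and hands the parser a stream instead of a URL. */
    static Schema loadSchema(URL schemaUrl) throws IOException, SAXException {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        try (InputStream in = schemaUrl.openStream()) {
            // The parser never sees the URL, so it cannot try to re-create it from a string.
            return factory.newSchema(new StreamSource(in));
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in schema written to a temp file so the sketch runs outside PL/Java.
        Path xsd = Files.createTempFile("demo", ".xsd");
        String text = "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
                + "<xs:element name=\"note\" type=\"xs:string\"/></xs:schema>";
        Files.write(xsd, text.getBytes(StandardCharsets.UTF_8));
        Schema schema = loadSchema(xsd.toUri().toURL());
        System.out.println("schema loaded: " + (schema != null));
        Files.delete(xsd);
    }
}
```

One trade-off: a StreamSource without a system ID gives the parser no base location, so a schema that pulls in other files through relative xs:include or xs:import references would need extra handling.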

Also, please try in a released version of PL/Java: 1.5.5 is the latest. The 1.6 branch is a construction zone; nothing in there is released yet. (Also, part of the construction involves the Java Platform Module System, which could be another reason the dbf: handler might not work there but work in 1.5.5. If you find that to be the case, please report back.)

I don't recommend using 1.6.0-anything before it is released, unless you have ideas for developing PL/Java.

petrjencek commented 4 years ago

Unfortunately, the code that tries to open the DTD schema through the dbf URL is inside JasperReports, which I only include in my project, so I cannot change how the streams are handled. I assume that the branch for the latest stable release is REL_1_5_STABLE. The dbf URLs are already in this stable branch as well (see https://github.com/tada/pljava/blob/REL1_5_STABLE/pljava/src/main/java/org/postgresql/pljava/sqlj/Loader.java). I could try to compile the latest stable release, but since the "dbf" streams are already there, I don't think it would make any difference.

jcflack commented 4 years ago

The branch is REL1_5_STABLE but still it is best to check out V1_5_5 specifically, as the most recent tagged release. Even branch heads are moving targets between releases.

The dbf: URL support has been tested before; I don't believe it has always been broken.

However, I believe the class loader returns a dbf: URL that is explicitly constructed with the correct stream handler. That URL should work. See:

https://github.com/tada/pljava/blob/V1_5_5/pljava/src/main/java/org/postgresql/pljava/sqlj/Loader.java#L319

It is possible that somewhere in the call stack of your example, something is trying to create a new URL by copying only the string representation of the real one. That seems iffy. It will take more time to see what code is doing that and why.

jcflack commented 4 years ago

Do you have access to the XML document being parsed, or is that also buried in the depths of Jasper? It would be good to see exactly how the schema loading attempt is being triggered.

petrjencek commented 4 years ago

When I executed the same code directly, I found that the URL points to a specific location in the jar file (specifically jar:file:/C:/Users/jencek/.gradle/caches/modules-2/files-2.1/net.sf.jasperreports/jasperreports/6.12.2/c358a8529f7a92ca3780f397b1cdc8a12cc9ab92/jasperreports-6.12.2.jar!/net/sf/jasperreports/engine/dtds/jasperreport-dtd-compat.xsd), so it is buried in the depths of Jasper but not in an unknown way.

jcflack commented 4 years ago

I see now, deep within Java's own XML implementation, a class that only holds the URL as a String, not as the URL that PL/Java supplied, so it has to be re-created later as a non-magical one, and that can't work.
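The effect described above can be sketched in a plain JVM. This is not PL/Java's Loader code: the handler below is a hypothetical stand-in that serves fixed bytes, and the sketch assumes no dbf handler is registered JVM-wide.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.nio.charset.StandardCharsets;

final class HandlerLostDemo {
    /** Hypothetical stand-in for a loader's private stream handler; serves fixed bytes. */
    static final URLStreamHandler HANDLER = new URLStreamHandler() {
        @Override
        protected URLConnection openConnection(URL u) {
            return new URLConnection(u) {
                @Override public void connect() { connected = true; }
                @Override public InputStream getInputStream() {
                    return new ByteArrayInputStream("<demo/>".getBytes(StandardCharsets.UTF_8));
                }
            };
        }
    };

    /** Builds a URL that carries its own handler, so it works without any registration. */
    static URL handlerBackedUrl() throws MalformedURLException {
        return new URL("dbf", "localhost", -1, "/42", HANDLER);
    }

    /** Re-creates the URL from its string form only, which drops the attached handler. */
    static boolean survivesStringRoundTrip(URL url) {
        try {
            new URL(url.toString());
            return true;
        } catch (MalformedURLException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        URL url = handlerBackedUrl();
        try (InputStream in = url.openStream()) {
            System.out.println("direct open read " + in.readAllBytes().length + " bytes");
        }
        System.out.println("survives string round trip: " + survivesStringRoundTrip(url));
    }
}
```

The original URL object opens fine because the handler travels with it; any code that keeps only the string and builds a new URL later has to rely on a JVM-wide lookup, which is where the unknown protocol error comes from.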

How pressing is your schedule for having this work? I may be able to work up a patch based on 1.5.5 that could temporarily serve your purpose, but I would target a more presentable fix to 1.6.0.

The scheme name dbf was not well-chosen to be registered JVM-wide; there is nothing about it that says "PL/Java" and one could easily imagine other packages with more dbfish names wanting to use it. It was perhaps ok as a magical URL that worked if you got it from PL/Java, but didn't require registering the name to be generally recognized in any URL.

So until something better is chosen, I am resistant to just making that name generally registered by default. If necessary, a patch you could apply would get your application working, as a deliberate opt-in on your part.
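To illustrate what "generally registered" means at the JVM level, here is a rough sketch using the standard URL.setURLStreamHandlerFactory call. It is not the patch mentioned above and not how PL/Java does or will do it; the scheme name pljavademo and the handler are hypothetical, and the JVM accepts only one such factory per process, which is one reason registration is better left as a deliberate opt-in.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicBoolean;

final class OptInSchemeRegistration {
    /** Hypothetical scheme name, chosen only for this sketch. */
    static final String SCHEME = "pljavademo";
    private static final AtomicBoolean REGISTERED = new AtomicBoolean();

    /** Hypothetical handler that serves fixed bytes instead of reading from a database. */
    static final class DemoHandler extends URLStreamHandler {
        @Override
        protected URLConnection openConnection(URL u) {
            return new URLConnection(u) {
                @Override public void connect() { connected = true; }
                @Override public InputStream getInputStream() {
                    return new ByteArrayInputStream("<demo/>".getBytes(StandardCharsets.UTF_8));
                }
            };
        }
    }

    /** Registers the scheme JVM-wide; repeated calls through this method are no-ops. */
    static void register() {
        if (REGISTERED.compareAndSet(false, true)) {
            // Returning null for other schemes lets the JVM fall back to its built-in handlers.
            URL.setURLStreamHandlerFactory(
                protocol -> SCHEME.equals(protocol) ? new DemoHandler() : null);
        }
    }

    public static void main(String[] args) throws IOException {
        register();
        // After registration the scheme parses from a plain string, with no explicit handler.
        URL url = new URL(SCHEME + "://localhost/42");
        try (InputStream in = url.openStream()) {
            System.out.println(new String(in.readAllBytes(), StandardCharsets.UTF_8));
        }
    }
}
```

Once a scheme is registered this way, every library in the JVM can build such URLs from strings, which is exactly why the choice of name matters.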

jcflack commented 4 years ago

Here's a simpler workaround you might try:

If you did sqlj.install_jar separately for the jasper dependency jar and for your own code jar, try not having the jasper jar installed in the database. Take it out with sqlj.remove_jar, put it somewhere on the PostgreSQL server's filesystem, and add it to the pljava.classpath PostgreSQL setting. (By default, that points to PL/Java's own jar, so if you set it explicitly, you have to include that original value, a colon, and the additional jar).

That is a bit more of a fiddly setup, but then getResource for entries in that jar will not be producing dbf URLs.
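One way to check whether the workaround took effect is to look at the scheme of the URL that getResource returns. The helper below is a hypothetical diagnostic, not part of PL/Java; inside a PL/Java function you could call it with a Jasper class and the schema's resource path and log the result.

```java
import java.net.URL;

final class ResourceOrigin {
    /**
     * Returns the URL scheme (for example "jar", "file", or "dbf") of a resource
     * resolved through the given class, or null if the resource is not found.
     */
    static String schemeOf(Class<?> anchor, String resourceName) {
        URL url = anchor.getResource(resourceName);
        return url == null ? null : url.getProtocol();
    }

    public static void main(String[] args) {
        // Plain-JVM demonstration using a JDK class; in PL/Java you would pass a class
        // from the jar in question and a path such as
        // "/net/sf/jasperreports/engine/dtds/jasperreport-dtd-compat.xsd".
        System.out.println(schemeOf(String.class, "String.class"));
    }
}
```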

petrjencek commented 4 years ago

Thank you very much, I'll try the workaround you've suggested. If this workaround worked then there's no need to hurry - I could live with that until it is resolved generally. I'll let you know the result as soon as I test it.

petrjencek commented 4 years ago

I removed jasperreports from the dependencies of my project (I use gradle to build one jar containing my code together with all dependencies). Then I set pljava.classpath using the following command:

set pljava.classpath to 'c:/Program Files/PostgreSQL/10/share/pljava/pljava-1.6.0-SNAPSHOT.jar;c:/Program Files/PostgreSQL/10/share/pljava/jasperreports.jar';

When I executed my function I received the following output:

WARNING: unrecognized configuration parameter "pljava.classpath"
ERROR: java.lang.NoClassDefFoundError: net/sf/jasperreports/engine/DefaultJasperReportsContext

It seems that pljava.classpath is not recognized and therefore the jar was not loaded. What am I doing wrong?

jcflack commented 4 years ago

Oh right, you're still using 1.6.0-SNAPSHOT. It doesn't have a pljava.classpath (1.6.0, when it is released, will support Java 9 and up, and have a pljava.module_path instead.)

If you switch to using a released version of PL/Java (may I suggest 1.5.5? It goes very well with the salmon.) you can set pljava.classpath.

If for some reason you want to keep living on the bleeding edge with 1.6.0, you will have to add -Djava.class.path=c:/Program Files/PostgreSQL/10/share/pljava/jasperreports.jar as a setting in pljava.vmoptions. (In that case, you do not have to include PL/Java's own jar, which is on the module path instead.)

petrjencek commented 4 years ago

Thank you very much, I've implemented the workaround using pljava.vmoptions in 1.6.0 and after several minor changes in dependencies it worked quite smoothly. Since the features provided by PL/Java in our environment are not so critical, I can live with 1.6.0 for now (I'm too lazy to compile 1.5.5 - see below). However, please let me know as soon as the original issue is fixed in 1.6.0 so I can drop this workaround (it's not very nice to have jars listed in postgresql.conf).

BTW: It would be nice to have a binary distribution of PL/Java, at least for the most widely used combinations of OS and PostgreSQL versions, because compiling it on Windows cost me many hours and I have to keep a special virtual machine just for the purpose of PL/Java compilation.

jcflack commented 4 years ago

Could you describe which OSes are in your environment?

There is .deb packaging for Debian and Ubuntu in the PGDG repository.

Likewise, the PGDG yum repository has contained Red Hat / Fedora builds from time to time, but less dependably.

For Windows, there are two flavors, depending on whether PostgreSQL itself was built with MSVC or with MinGW-64. At one time BigSQL had a Windows-MinGW-64 build available, but BigSQL seems to have become something else and I am unsure of the story there.

There is a Google Summer of Code project in progress that should give us continuous integration, which may better position us to offer more prebuilt binaries.

I am sorry about your experience building for Windows; if you have insights about what took the most time, you could share those (maybe in a separate issue); at minimum, we might be able to improve the Windows build documentation to help something go more smoothly.

I'll just reiterate about 1.6.0-SNAPSHOT that until it is released it is likely to be something different every time you look and no one can say if those somethings will work.

petrjencek commented 4 years ago

Currently we use mainly Windows 2016 Standard (64bit) together with PostgreSQL 10 (64bit) downloaded from https://www.enterprisedb.com/downloads/postgres-postgresql-downloads (probably built on MSVC - I used it to compile PL/java which works with the current version of PostgreSQL which we use).

I can't identify (or remember) any major difficulty in compiling; it was just complicated and time-consuming to establish the same environment as on the server (PostgreSQL, Windows 64 bit) together with MSVC 64bit on another machine, especially since we are not very familiar with MSVC and its build tools.

Anyway - it's good to know that this project is being developed and the community cares about users :).