schemaspy / schemaspy

Database documentation built easy
http://schemaspy.org
GNU Lesser General Public License v3.0

Failed to parse [schema].meta.xml #1420

Closed: hansena closed this issue 8 months ago

hansena commented 8 months ago

Expected Behavior

SchemaMeta should be processed and generate a foreign key relationship.

Current Behavior

INFO  - Analyzing 'discovery_apic4e'
INFO  - Starting schema analysis
DEBUG - DbSpecificOption name: 'hostOptionalPort' value: '192.168.0.1:5432' description: 'null' 
DEBUG - DbSpecificOption name: 'db' value: 'discovery' description: 'database name' 
DEBUG - supportsSchemasInTableDefinitions: true 
DEBUG - supportsCatalogsInTableDefinitions: false 
INFO  - Parsing /Users/n0145598/__github/db-discovery-schema/.schemaspy/discovery_apic4e.meta.xml 
[Fatal Error] :1:1: Premature end of file.
DEBUG - Command line parameters: [-dp, /drivers_inc/, -o, /output, -meta, /Users/n0145598/__github/db-discovery-schema/.schemaspy/, -host, 192.168.0.1, -debug] 
ERROR - Bad config 
org.schemaspy.model.InvalidConfigurationException: Failed to parse /Users/n0145598/__github/db-discovery-schema/.schemaspy/discovery_apic4e.meta.xml
        at org.schemaspy.input.dbms.xml.SchemaMeta.parse(SchemaMeta.java:165)
        at org.schemaspy.input.dbms.xml.SchemaMeta.<init>(SchemaMeta.java:89)
        at org.schemaspy.SchemaAnalyzer.analyze(SchemaAnalyzer.java:254)
        at org.schemaspy.SchemaAnalyzer.analyzeMultipleSchemas(SchemaAnalyzer.java:186)
        at org.schemaspy.SchemaAnalyzer.analyze(SchemaAnalyzer.java:127)
        at org.schemaspy.cli.SchemaSpyRunner.runAnalyzer(SchemaSpyRunner.java:109)
        at org.schemaspy.cli.SchemaSpyRunner.run(SchemaSpyRunner.java:98)
        at org.schemaspy.Main.main(Main.java:55)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:78)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:568)
        at org.springframework.boot.loader.MainMethodRunner.run(MainMethodRunner.java:48)
        at org.springframework.boot.loader.Launcher.launch(Launcher.java:87)
        at org.springframework.boot.loader.Launcher.launch(Launcher.java:51)
        at org.springframework.boot.loader.JarLauncher.main(JarLauncher.java:52)
Caused by: org.xml.sax.SAXParseException: Premature end of file.
        at java.xml/com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:261)
        at java.xml/com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:339)
        at java.xml/javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:206)
        at org.schemaspy.input.dbms.xml.SchemaMeta.parse(SchemaMeta.java:163)
        ... 15 common frames omitted
Error: exit status 4

Steps to Reproduce

  1. Configure discovery_apic4e.meta.xml as:

    <?xml version="1.0" encoding="UTF-8"?>
    <schemaMeta xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="http://schemaspy.sourceforge.net/xmlschema/2011/02/05/schemaspy.meta.xsd">
    <tables>
        <table name="linter_execution_details">
            <column name="event_id" type="TEXT">
                <foreignKey table="linter_executions" column="event_id" />
            </column>
        </table>
        <table name="linter_executions">
            <column name="event_id" type="TEXT"/>
        </table>
    </tables>
    </schemaMeta>
  2. Configure schemaspy.properties as:

    # type of database. Run with -dbhelp for details
    schemaspy.t=pgsql
    # optional path to alternative jdbc drivers.
    schemaspy.dp=/Applications/DbVisualizer.app/Contents/java/app/jdbc/postgresql/postgresql.jar
    # database properties: host, port number, name user, password
    schemaspy.cat=discovery
    schemaspy.port=5432
    schemaspy.db=discovery
    schemaspy.u=username
    schemaspy.p=password
    # output dir to save generated files
    schemaspy.o=$PWD/schemaspy
    # db scheme for which generate diagrams
    schemaspy.schemas=discovery_apic4e
    schemaspy.meta=$PWD/.schemaspy/
  3. Run schemaspy in a docker container using the command:

    nerdctl run \
    -v "$PWD/schemaspy:/output" \
    -v "$PWD/.schemaspy/schemaspy.properties:/schemaspy.properties" \
    -v "$PWD/.schemaspy/discovery_apic4e.meta.xml" \
    schemaspy/schemaspy:latest \
    -meta "$PWD/.schemaspy/" \
    -host 192.168.0.1:5432 \
    -debug
  4. Review stdout

hansena commented 8 months ago

Totally open to this being entirely on me missing something obvious but I've done more than a few laps on this and can't seem to turn up anything that seems like it would cause this to fail. Any direction will be greatly appreciated.

npetzall commented 8 months ago

Hi, sorry for the late reply.

One thing that I'm unsure of is the validation. I know we have it in the code, but the noNamespaceSchemaLocation needs to be updated (possibly).

But that might be an issue after the parsing has passed.

I'll try to test a bit if I get time tonight. But some common problems that I've experienced with different parsers are:

The file has switched encoding and there is a non-printable character at the beginning of the file.

File isn't actually UTF-8 encoded.
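
Both of these encoding problems can be checked from a shell before SchemaSpy is involved. The following is only a sketch: the function names has_bom and is_utf8 are made up for illustration, and it assumes head, od, tr and iconv are available on the machine.

```shell
# Hypothetical helpers, not part of SchemaSpy.

# Succeeds if the file starts with the UTF-8 byte order mark (EF BB BF),
# an invisible prefix that some XML parsers reject.
has_bom() {
    [ "$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')" = "efbbbf" ]
}

# Succeeds if the whole file decodes as UTF-8 (assumes iconv is installed).
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# Example use against the meta file from the report:
#   has_bom .schemaspy/discovery_apic4e.meta.xml && echo "BOM found"
#   is_utf8 .schemaspy/discovery_apic4e.meta.xml || echo "not valid UTF-8"
```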

As far as SchemaSpy goes, it seems to think that the file exists, and it runs into problems when trying to parse it.

Secondly, and I think this might be a typo either in the command or introduced when it was added to GitHub, there is the mount argument, which might get treated as a volume and folder.

I checked the code last night, and we might only check whether it exists and not whether it's a volume.

But as I wrote earlier, I don't think that the mount works:

-v "$PWD/.schemaspy/discovery_apic4e.meta.xml"

Since it's missing the target/destination in the container.
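
For comparison, a -v bind mount normally takes the form host-path:container-path. The sketch below wraps the command from the report in a function and gives the third mount a target. The /meta directory inside the container is an assumption chosen for illustration, as are the run_schemaspy name and the CONTAINER_CLI variable (which exists only so the command can be printed rather than executed). The reporter's actual fix is not shown in this thread.

```shell
# Sketch only: /meta is a hypothetical container-side directory.
run_schemaspy() {
    "${CONTAINER_CLI:-nerdctl}" run \
        -v "$PWD/schemaspy:/output" \
        -v "$PWD/.schemaspy/schemaspy.properties:/schemaspy.properties" \
        -v "$PWD/.schemaspy/discovery_apic4e.meta.xml:/meta/discovery_apic4e.meta.xml" \
        schemaspy/schemaspy:latest \
        -meta /meta/ \
        -host 192.168.0.1:5432 \
        -debug
}

# Dry run: print the assembled arguments instead of starting a container.
#   CONTAINER_CLI=echo run_schemaspy
```

With a target in place, the path given to -meta refers to a location that exists inside the container rather than to a host path.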

npetzall commented 8 months ago

Totally unrelated to the issue, but I saw you are using macOS. If you're using Apple Silicon, you can use the snapshot tag and get better performance, since it has an arm64 image.

npetzall commented 8 months ago

I've also verified that if the mount is wrong, SchemaSpy will try to parse a folder, and that's wrong.

We need to add some validation that we are actually trying to read a file and that we are able to read it. That is something we can do with regard to error handling.
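
Until such a check exists in SchemaSpy itself, something similar can be done by hand before a run. The function below is a hypothetical shell pre-flight check, not SchemaSpy code; the name check_meta_file and its messages are invented for this sketch.

```shell
# Hypothetical pre-flight check, run wherever the path is visible
# (on the host, or in a shell inside the container).
check_meta_file() {
    if [ ! -e "$1" ]; then
        echo "missing: $1" >&2; return 1
    elif [ -d "$1" ]; then
        echo "is a directory, not a file: $1" >&2; return 1
    elif [ ! -r "$1" ]; then
        echo "not readable: $1" >&2; return 1
    elif [ ! -s "$1" ]; then
        echo "empty file: $1" >&2; return 1
    fi
    echo "ok: $1"
}

# Example use:
#   check_meta_file "$PWD/.schemaspy/discovery_apic4e.meta.xml"
```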

As for the validation that I mentioned, it must be relaxed and ignore the defined one.

I copied the meta XML above, added it to a folder, and specified the folder as meta. I was able to parse it without issues.

So either it's the file encoding issue or your mount is not correct.

You can add -ti --rm --entrypoint sh to nerdctl and remove all arguments after schemaspy/schemaspy:latest. This will drop you into a shell within the container, where you can validate the mounts and also run vi to check the contents.
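
Applied to the command from the report, that suggestion could look roughly like the sketch below. The mounts are copied unchanged so the container shows what the original run saw. The names debug_shell and CONTAINER_CLI are hypothetical; the latter only allows a dry run that prints the command.

```shell
# Sketch only: same image and mounts as the report, SchemaSpy arguments removed.
debug_shell() {
    "${CONTAINER_CLI:-nerdctl}" run -ti --rm --entrypoint sh \
        -v "$PWD/schemaspy:/output" \
        -v "$PWD/.schemaspy/schemaspy.properties:/schemaspy.properties" \
        -v "$PWD/.schemaspy/discovery_apic4e.meta.xml" \
        schemaspy/schemaspy:latest
}

# Inside the container, inspect the path that was passed to -meta, e.g.:
#   ls -ld "<path to>/discovery_apic4e.meta.xml"   # a leading "d" means directory
#   vi "<path to>/discovery_apic4e.meta.xml"
```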

hansena commented 8 months ago

@npetzall you’re 100% right on the bad mount. Thanks a ton for the quick response