popbr / data-integration

Apache License 2.0
1 stars 4 forks source link

Text too big? #15

Closed MNSleeper closed 1 year ago

MNSleeper commented 1 year ago

While running tests on a new db file, I ran into the error

com.mysql.cj.jdbc.exceptions.MysqlDataTruncation: Data truncation: Data too long for column 'ORG_CITY' at row 1
    at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException (SQLExceptionsMapping.java:104)
    at com.mysql.cj.jdbc.ClientPreparedStatement.executeInternal (ClientPreparedStatement.java:953)
    at com.mysql.cj.jdbc.ClientPreparedStatement.execute (ClientPreparedStatement.java:371)
    at popbr.DatabaseIO.WriteToSQL (DatabaseIO.java:822)
    at popbr.DatabaseIO.Parsefromtxt (DatabaseIO.java:400)
    at popbr.DatabaseIO.main (DatabaseIO.java:135)
    at org.codehaus.mojo.exec.ExecJavaMojo$1.run (ExecJavaMojo.java:254)
    at java.lang.Thread.run (Thread.java:831)

I saw that it was warning about file inputs being too large, so I tried to increase the size of inputs into SQL from Varchar(255) to Varchar(511). I still get the error. It happens for every field, so It's not just the one field being too big. I have linked the file in question, and xlsx sheet from RePoRTER.

Any advice? RePORTER_PRJ_C_FY1986.csv

aubertc commented 1 year ago

I have a similar error message:

[WARNING] 
com.mysql.cj.jdbc.exceptions.MysqlDataTruncation: Data truncation: Data too long for column 'APPLICATION_ID' at row 1
    at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException (SQLExceptionsMapping.java:104)
    at com.mysql.cj.jdbc.ClientPreparedStatement.executeInternal (ClientPreparedStatement.java:953)
    at com.mysql.cj.jdbc.ClientPreparedStatement.execute (ClientPreparedStatement.java:371)
    at popbr.DatabaseIO.WriteToSQL (DatabaseIO.java:824)
    at popbr.DatabaseIO.Parsefromtxt (DatabaseIO.java:400)
    at popbr.DatabaseIO.main (DatabaseIO.java:135)
    at org.codehaus.mojo.exec.ExecJavaMojo$1.run (ExecJavaMojo.java:254)
    at java.lang.Thread.run (Thread.java:833)
[WARNING] thread Thread[mysql-cj-abandoned-connection-cleanup,5,popbr.DatabaseIO] was interrupted but is still alive after waiting at least 15000msecs
[WARNING] thread Thread[mysql-cj-abandoned-connection-cleanup,5,popbr.DatabaseIO] will linger despite being asked to die via interruption
[WARNING] NOTE: 1 thread(s) did not finish despite being asked to  via interruption. This is not a problem with exec:java, it is a problem with the running code. Although not serious, it should be remedied.
[WARNING] Couldn't destroy threadgroup org.codehaus.mojo.exec.ExecJavaMojo$IsolatedThreadGroup[name=popbr.DatabaseIO,maxpri=10]
java.lang.IllegalThreadStateException
    at java.lang.ThreadGroup.destroy (ThreadGroup.java:803)
    at org.codehaus.mojo.exec.ExecJavaMojo.execute (ExecJavaMojo.java:293)
    at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo (DefaultBuildPluginManager.java:137)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:210)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:156)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:148)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:117)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:81)
    at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build (SingleThreadedBuilder.java:56)
    at org.apache.maven.lifecycle.internal.LifecycleStarter.execute (LifecycleStarter.java:128)
    at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:305)
    at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:192)
    at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:105)
    at org.apache.maven.cli.MavenCli.execute (MavenCli.java:957)
    at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:289)
    at org.apache.maven.cli.MavenCli.main (MavenCli.java:193)
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:77)
    at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke (Method.java:568)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced (Launcher.java:282)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launch (Launcher.java:225)
    at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode (Launcher.java:406)
    at org.codehaus.plexus.classworlds.launcher.Launcher.main (Launcher.java:347)
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  48.465 s
[INFO] Finished at: 2022-11-01T10:42:51-04:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:3.0.0:java (default-cli) on project Database-IO: An exception occured while executing the Java class. Data truncation: Data too long for column 'APPLICATION_ID' at row 1 -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
aubertc commented 1 year ago

With https://github.com/popbr/data-integration/commit/89705f0d1a719ad76729619cf00b9ac5690a0630 I have starting to investigate, but have to stop now.

The program displays, before the error,

TagName:APPLICATION_ID
 data: APPLICATION_ID
TagName:ORG_CITY
 data: ORG_CITY
TagName:ORG_NAME
 data: ORG_NAME

even if I delete the first row of the .csv file, which I find very weird. Thoughts?

aubertc commented 1 year ago

If you change

https://github.com/popbr/data-integration/blob/12e36bac1d6d0eea6fa1ab7bb13bb622d0c6f238/Project/Database-IO/src/main/java/popbr/DatabaseIO.java#L774

to

CreateTable += TagName[j] + " TEXT";

then the insertion works, the linkage complains about java.sql.SQLException: Illegal operation on empty result set., but the data inserted is trash:

|    9049 | 3594718,N01,AI,3,,,03/26/1986,06/30/1987,855,N01AI052588,SCHOOLS OF MEDICINE,,N01AI052588-001,,,1986,NATIONAL INSTITUTE OF ALLERGY AND INFECTIOUS DISEASES,,LEXINGTON,UNITED STATES,,6,832127323,US,UNIVERSITY OF KENTUCKY,KY,405060057,,3312037,,"THOMPSON, JOHN S;",,07/01/1985,05/31/1988,homologous transplantation,human therapy evaluation,kidney transplantation,monoclonal antibody,kidney function,T lymphocyte,antibody specificity,immunotherapy,immunosuppression,human subject,,ONOCLONAL ANTIBODIES TO TREAT KIDNEY GRAFT REJECTION,52588,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,                                                                                                                                                                                                                                                                                                                                                                                                                                         | NULL                                                                                                                                                              | NULL     |
|    9050 | 3594724,N01,AI,5,,,02/25/1986,08/02/1987,856,N01AI052590,SCHOOLS OF ARTS AND SCIENCES,,N01AI052590-002,,,1986,NATIONAL INSTITUTE OF ALLERGY AND INFECTIOUS DISEASES,,LOWELL,UNITED STATES,,5,956072490,US,UNIVERSITY OF MASSACHUSETTS LOWELL,MA,1854,,3304293,,"BRUCE, JOHN I;",,08/03/1985,08/02/1990,schistosomiasis,animal colony,disease vectors,,SUPPLY SCHISTOSOME INFECTED SNAILS AND MAMMALS,52590,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               | NULL                                                                                                                                                              | NULL     |
|    9051 | 3594725,N01,AI,3,,,04/24/1986,08/02/1987,,N01AI052590,SCHOOLS OF ARTS AND SCIENCES,,N01AI052590-003,,,1986,NATIONAL INSTITUTE OF ALLERGY AND INFECTIOUS DISEASES,,LOWELL,UNITED STATES,,5,956072490,US,UNIVERSITY OF MASSACHUSETTS LOWELL,MA,1854,,3304293,,"BRUCE, JOHN I;",,08/03/1985,08/02/1990,,SUPPLY SCHISTOSOME INFECTED SNAILS AND MAMMALS,52590,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             | NULL                                                                                                                                                              | NULL     |
+---------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+

etc. The " are probably playing a trick on you, in any case the value is indeed "too big" as we capture "too much" to insert.

aubertc commented 1 year ago

Can we close this, @MNSleeper ?

MNSleeper commented 1 year ago

Yes