IHTSDO / rf2-to-rf1-conversion

A utility for converting an RF2 archive into RF1 without reliance on additional information such as a compatibility package

NULL relationship ID in converting January 2019 Alpha release #4

Closed PeterTrigg closed 5 years ago

PeterTrigg commented 6 years ago

I'm trying to convert (tool version 1.4.0) the January 2019 Alpha release (xSnomedCT_InternationalRF2_ALPHA_20190131T120000Z.zip) and get the following:

Converting RF2 to RF1... 98.42% complete.
Process failed in 1.995 h after completing 436/443 operations.
Relationship Ids Issued: 9093
Relationship Ids Skipped (already in use): 30906
Relationship Ids remaining: 0
Relationship Ids lacking: 0
Cleaning up resources... 98.87% complete.
Exception in thread "main" java.lang.RuntimeException: Failed to execute SQL Statement
    at org.ihtsdo.snomed.rf2torf1conversion.DBManager$StatementRunner.run(DBManager.java:194)
    at org.ihtsdo.snomed.rf2torf1conversion.DBManager.runStatement(DBManager.java:159)
    at org.ihtsdo.snomed.rf2torf1conversion.DBManager.executeResource(DBManager.java:95)
    at org.ihtsdo.snomed.rf2torf1conversion.ConversionManager.convert(ConversionManager.java:499)
    at org.ihtsdo.snomed.rf2torf1conversion.ConversionManager.doRf2toRf1Conversion(ConversionManager.java:286)
    at org.ihtsdo.snomed.rf2torf1conversion.ConversionManager.main(ConversionManager.java:204)
Caused by: org.h2.jdbc.JdbcSQLException: Data conversion error converting "'' (RF21_STATED_REL: RELATIONSHIPID BIGINT SELECTIVITY 100)"; SQL statement:
UPDATE rf21_stated_rel r SET RELATIONSHIPID = relationshipIdFor (r.conceptid1, r.relationshiptype, r.conceptid2, r.relationshipgroup, true) WHERE r.relationshipid is null [22018-193]
    at org.h2.message.DbException.getJdbcSQLException(DbException.java:345)
    at org.h2.message.DbException.get(DbException.java:179)
    at org.h2.message.DbException.get(DbException.java:155)
    at org.h2.table.Column.convert(Column.java:154)
    at org.h2.command.dml.Update.update(Update.java:121)
    at org.h2.command.CommandContainer.update(CommandContainer.java:98)
    at org.h2.command.Command.executeUpdate(Command.java:258)
    at org.h2.jdbc.JdbcStatement.executeInternal(JdbcStatement.java:184)
    at org.h2.jdbc.JdbcStatement.execute(JdbcStatement.java:158)
    at org.ihtsdo.snomed.rf2torf1conversion.DBManager$StatementRunner.run(DBManager.java:183)
    ... 5 more

pgwilliams commented 6 years ago

Hi Peter. I wasn't able to replicate this issue, and I suspect that's because I'm not passing in a previous RF1 package. I was converting this package: xSnomedCT_InternationalRF2_ALPHA_20190131T120000Z.zip (md5 dcdec4d62a9b75da4972c00c4a6b2080).

Could you give me the command line arguments you used, and also can I get hold of the previous RF1 package if you specified it? Thanks! Peter

PeterTrigg commented 5 years ago

I'll look at a method of getting the old RF1 zip archive to you (too large for an attachment here). The script (go.bat) used to run the conversion is:

@echo off

SET "memParams=-Xms2g -Xmx10g"
SET debugParams=
SET "rf2Archive=D:\Release\ReleaseBuild\RF2toRF1\xSnomedCT_InternationalRF2_ALPHA_20190131T120000Z.zip"
SET "secondDrive=E:\" 

SET newMemory=
set /p newMemory="How much memory do you have available? [10g]: "
IF NOT [%newMemory%]==[] SET "memParams=-Xms2g -Xmx%newMemory%"

SET driveParam=
set /p driveAvailable="Do you have a 2nd drive? (eg %secondDrive%) Y/N: "
IF /I "%driveAvailable%"=="Y" SET "driveParam=-u %secondDrive%"

SET newLocation=
SET /p newLocation="Where is the RF2 Archive? [%rf2Archive%]: "
IF NOT [%newLocation%]==[] SET "rf2Archive=%newLocation%"

FOR %%a IN (%*) DO (
    IF /I "%%a"=="-d" SET "debugParams=-Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=8080"
)

@echo on
java -d64 %memParams% %debugParams% -jar RF2toRF1Converter.jar %driveParam% %* %rf2Archive% -p "D:\Release\ReleaseBuild\RF2toRF1\SnomedCT_RF1Release_INT_20180731.zip"

PeterTrigg commented 5 years ago

Peter, I've dropped a copy of the previous RF1 conversion onto Google Drive. You should already have access and here is a link: <link removed - thanks Peter!>

pgwilliams commented 5 years ago

Hi Peter. It turns out that we ran out of pre-allocated relationship SCTIDs in this run. The process is supposed to detect that and tell you how many more it needs, but an unexpected carriage return in the file meant that it fell over trying to insert something invalid into the database.

I've corrected (and tested) that fault so that the tool can handle that sort of file-editing problem, and made it clearer when more SCTIDs are required, and how many. I then added 30K more relationship ids, which should see us through to 2021. That tested fine on my machine.
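
For anyone hitting something similar, here is a minimal sketch of the kind of defensive parsing described above. It is not the converter's actual code: the class name RelationshipIdPool and the methods parseIds and idsLacking are hypothetical, and it assumes the pre-allocated ids are stored one per line. The idea is that a stray carriage return or blank line is skipped rather than passed on as an empty value, and that a shortfall is reported as a count.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not taken from the converter's source.
class RelationshipIdPool {

    // Parses one id per line, skipping blank lines left behind by stray
    // carriage returns (for example a lone "\r" or a doubled "\r\n").
    static List<Long> parseIds(String fileContent) {
        List<Long> ids = new ArrayList<>();
        // \R matches any line break: \r\n, \n or \r
        for (String line : fileContent.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // never hand an empty value on to the database
            }
            ids.add(Long.parseLong(trimmed));
        }
        return ids;
    }

    // Number of additional ids needed; 0 when the pool is large enough.
    static int idsLacking(int available, int required) {
        return Math.max(0, required - available);
    }

    public static void main(String[] args) {
        // Placeholder values, not real SCTIDs
        List<Long> ids = parseIds("101\r\n\r\n102\n\r103\n");
        System.out.println("Ids available: " + ids.size());
        System.out.println("Ids lacking: " + idsLacking(ids.size(), 5));
    }
}
```

A malformed non-empty line still fails fast with a NumberFormatException at parse time, rather than surfacing later as a data conversion error in an UPDATE statement.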

Could you please pick up the latest version of the tool from the develop branch? Once you've confirmed you're happy it's fixed, I'll merge that into the master branch. Best Wishes, Peter

PeterTrigg commented 5 years ago

Peter, that appears to have gone through without issue. Summary of relationship IDs is:

Process completed in 1.234 h after completing 459/443 operations.
Relationship Ids Issued: 11419
Relationship Ids Skipped (already in use): 30906
Relationship Ids remaining: 27673
Relationship Ids lacking: 0

Thank you.