Closed: mihailefter closed this issue 7 years ago.
We went for the second solution in order not to alter the database. We implemented the following:

- Modified the `__alterBatchEntries` function not to alter entries which have the 'S2' flag.
- In the `__processFlags` function we check for the 'S2' flag and output a message mentioning that the line was not processed since its length was greater than 190.

We also discovered that `__alterBatchEntries` changes the input sequence when one accession is a substring of another accession which appears later in the file, since the first part of the latter one gets replaced.
Input file sequence:

```
NM_001315:c.723_723delinsAC
NM_001315507:c.1826_1831G
```

During the processing of NM_001315:c.723_723delinsAC, the entry NM_001315507:c.1826_1831G is replaced with NM_001315.2507:c.1826_1831G.
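This substring hazard can be avoided by anchoring the accession match instead of using a plain string replacement. A minimal sketch, not the actual Mutalyzer code, using the accessions from the example above:

```python
import re

line = "NM_001315507:c.1826_1831G"

# A plain string replacement also matches the prefix of the longer
# accession and corrupts it:
print(line.replace("NM_001315", "NM_001315.2"))
# -> NM_001315.2507:c.1826_1831G

# Requiring a non-digit after the accession (here the ':' separator)
# leaves the longer accession untouched:
print(re.sub(r"NM_001315(?=\D)", "NM_001315.2", line))
# -> NM_001315507:c.1826_1831G

# The intended entry is still updated:
print(re.sub(r"NM_001315(?=\D)", "NM_001315.2", "NM_001315:c.723_723delinsAC"))
# -> NM_001315.2:c.723_723delinsAC
```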
Problem
It looks like the batch processor crashes in the `__alterBatchEntries` function. Consider an input file with the following contents:

During the processing of the first job (which proceeds without crashing), Mutalyzer fetches the most recent version of NM_024690, which is NM_024690.2. Next, it tries to update any other entries in the `batch_queue_items` database table which use only NM_024690 (without a version number) to the most recent version. This is done in order to speed up the batch process when those jobs are reached. In this case it tries to update the second job. The value stored in the `item` column of the `batch_queue_items` table for the second job is 200 characters long, and it is to be replaced by a larger one of 202 characters. Since this is greater than the maximum allowed, the query results in an error.

It seems that an input line is automatically truncated to 200 characters when added to the database, so no error appears there, but during the replace operation the truncation is no longer performed.
Possible solutions

- Change the `item` column type in the `batch_queue_items` table to a variable unlimited-length type. This is supported by PostgreSQL as type `text`, but we did not check other SQL database management systems.
- Do not alter the database: skip entries whose updated value would exceed the limit, mark them with an 'S2' flag, and report that the line was not processed because its length is greater than 190.
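If the first route is taken, the migration on PostgreSQL would be a single statement (a sketch; table and column names as used in this issue):

```sql
-- PostgreSQL: lift the length limit on the item column
ALTER TABLE batch_queue_items ALTER COLUMN item TYPE text;
```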