ecrin-github / MDR_Aggregator

MIT License
1 stars 0 forks source link

Phase 5: ***ERROR*** In ExecuteSQL; 54000: out of memory #15

Open michelescarlato opened 1 month ago

michelescarlato commented 1 month ago

After applying the changes mentioned in #9, now there is an out of memory error:

17/09/2024 : 08:19 :   Updated temp_conditions_by_study data, ids 2525001 to 2550000
17/09/2024 : 08:19 :   Updated temp_conditions_by_study data, ids 2550001 to 2575000

+++++++++++++++++++++++++++++++++++++++
17/09/2024 : 08:19 :   ***ERROR*** In ExecuteSQL; 54000: out of memory

DETAIL: Cannot enlarge string buffer containing 1073741813 bytes by 81 more bytes., 
SQL was: insert into core.temp_conditions_by_study(study_id, condition_concat)
                     select study_id, string_agg(condition_text, ' ')
                     from core.temp_conditions s  where study_id >= 2575001 and study_id < 2600001  group by study_id
+++++++++++++++++++++++++++++++++++++++

17/09/2024 : 08:19 :   Updated temp_conditions_by_study data, ids 2575001 to 2600000

+++++++++++++++++++++++++++++++++++++++
17/09/2024 : 08:20 :   ***ERROR*** In ExecuteSQL; 54000: out of memory

DETAIL: Cannot enlarge string buffer containing 1073741784 bytes by 82 more bytes., 
SQL was: insert into core.temp_conditions_by_study(study_id, condition_concat)
                     select study_id, string_agg(condition_text, ' ')
                     from core.temp_conditions s  where study_id >= 2600001 and study_id < 2625001  group by study_id
+++++++++++++++++++++++++++++++++++++++

17/09/2024 : 08:20 :   Updated temp_conditions_by_study data, ids 2600001 to 2625000
17/09/2024 : 08:20 :   Updated temp_conditions_by_study data, ids 2625001 to 2650000
michelescarlato commented 1 month ago

I commented

if (opts.do_indexes)
        {
            // There are two aspects of setting up search data. One is to create searchable
            // tables to respond to queries and filters. The other is to set up
            // suitable JSON fields to return to the UI in response to those queries.
            // The lup schemas from context, as well as the aggs schemas, are required.

            string core_conn_string = _credentials.GetConnectionString("mdr");
            List<string> aggs_schemas = new(){"st", "ob", "nk"};
            _monDatalayer.SetUpTempFTWs(_credentials, core_conn_string, "core", "aggs", aggs_schemas);
            List<string> ctx_schemas = new(){"lup"};
            _monDatalayer.SetUpTempFTWs(_credentials, core_conn_string, "core", "context", ctx_schemas);
            _loggingHelper.LogLine("FTW tables recreated");

            // Initial task is to create JSON versions of the object data (as part of this will be
            // incorporated into study json and tables, from where it can be returned when necessary).
            // Querying and filtering is almost always done against studies rather than objects - the
            // exception being PMIDs and even then it is studies that are returned.
            // Preparing object data is therefore focused on creating the JSON required. The routine
            // below generates both a 'full' JSON image of each object plus a much smaller JSON
            // fragment that will be returned within search results.

            CoreSearchBuilder csb = new CoreSearchBuilder(core_conn_string, _loggingHelper);

            _loggingHelper.LogHeader("Creating JSON object data");
            csb.CreateJSONObjectData();  

            // Tables are then created to hold data for querying in various ways

            _loggingHelper.LogHeader("Setting up study search tables");

            //csb.CreateIdentifierSearchDataTable();

            _loggingHelper.LogHeader("Setting up PMID search data tables");

            //csb.CreatePMIDSearchDataTable();

            _loggingHelper.LogHeader("Setting up LEXEME search data tables");

            //csb.CreateLexemeSearchDataTable();

            _loggingHelper.LogHeader("Setting up Country search data tables");

            //csb.CreateCountrySearchDataTable();  

            // The study data json objects are then created

            _loggingHelper.LogHeader("Creating JSON study data");
            csb.CreateJSONStudyData();

            _loggingHelper.LogHeader("Creating data in search tables");
            csb.AddStudyJsonToSearchTables();
            csb.SwitchToNewTables();

            // Drop FTW schemas.

            _monDatalayer.DropTempFTWs(core_conn_string, "aggs", aggs_schemas);
            _monDatalayer.DropTempFTWs(core_conn_string, "context", ctx_schemas);
            _loggingHelper.LogLine("FTW tables dropped");
        }

And the Indexing didn't terminate AG -X 2024-09-17 084505.log

michelescarlato commented 1 month ago

Anyway, seems that it is still running image

michelescarlato commented 1 month ago

As soon as it terminates, I will try to increase the number of records added before adding a log entry. Now is set to 100, and I would like to increase it to 10000.

michelescarlato commented 1 month ago

It is still running

17/09/2024 : 13:20 :   41500 records processed
17/09/2024 : 13:20 :   41600 records processed
17/09/2024 : 13:20 :   41700 records processed
17/09/2024 : 13:20 :   41800 records processed
17/09/2024 : 13:20 :   41900 records processed
17/09/2024 : 22:03 :   42000 records processed
17/09/2024 : 22:03 :   42100 records processed
17/09/2024 : 22:03 :   42200 records processed
17/09/2024 : 22:03 :   42300 records processed
17/09/2024 : 22:03 :   42400 records processed
17/09/2024 : 22:03 :   42500 records processed
17/09/2024 : 22:03 :   42600 records processed

AG -X 2024-09-17 084505.log

image

michelescarlato commented 1 month ago

Still running

20/09/2024 : 05:47 :   372400 records processed
20/09/2024 : 05:47 :   372500 records processed
20/09/2024 : 05:47 :   372600 records processed
20/09/2024 : 05:47 :   372700 records processed
20/09/2024 : 05:47 :   372800 records processed
20/09/2024 : 05:47 :   372900 records processed
20/09/2024 : 05:47 :   373000 records processed
20/09/2024 : 05:47 :   373100 records processed
20/09/2024 : 05:47 :   373200 records processed

image

michelescarlato commented 1 month ago

to date, still running ... AG -X 2024-09-17 084505.log

michelescarlato commented 1 month ago
23/09/2024 : 16:24 :   718500 records processed

23/09/2024 : 16:28 :   **** UNHANDLED EXCEPTION ****

+++++++++++++++++++++++++++++++++++++++
23/09/2024 : 16:28 :   ***ERROR*** MDR_Aggregator application aborted

Exception of type 'System.OutOfMemoryException' was thrown.

   at System.Text.Json.JsonReaderHelper.TranscodeHelper(ReadOnlySpan`1 utf8Unescaped)
   at System.Text.Json.JsonSerializer.WriteString[TValue](TValue& value, JsonTypeInfo`1 jsonTypeInfo)
   at System.Text.Json.JsonSerializer.Serialize[TValue](TValue value, JsonSerializerOptions options)
   at MDR_Aggregator.SearchHelperJson.LoopThroughStudyRecords(Int32 offset) in C:\Users\ecrinadmin\source\repos\MDR_Aggregator\SearchHelpers\SearchHelperJSON.cs:line 90
   at MDR_Aggregator.CoreSearchBuilder.CreateJSONStudyData(Boolean create_table, Int32 offset) in C:\Users\ecrinadmin\source\repos\MDR_Aggregator\SearchHelpers\CoreSearchBuilder.cs:line 35
   at MDR_Aggregator.Aggregator.AggregateData(Options opts) in C:\Users\ecrinadmin\source\repos\MDR_Aggregator\TopLevelClasses\Aggregator.cs:line 253
   at Program.<Main>$(String[] args) in C:\Users\ecrinadmin\source\repos\MDR_Aggregator\TopLevelClasses\Program.cs:line 62
+++++++++++++++++++++++++++++++++++++++

23/09/2024 : 16:28 :   **** CLOSING LOG ****

It terminated with out of memory.