Tentatively added the commands.
TODO:
- Additional fixes to ImportExport.
The current state is available at https://github.com/RotherOSS/otobo/tree/issue-%233682-importexport_with_chunking.
Another thing to discuss:
The new files Kernel/System/Console/Command/Admin/ImportExport/Import.pm and Kernel/System/Console/Command/Admin/ImportExport/Export.pm are planned to be added to OTOBO 11.1. For OTOBO 11.0 the plan is to backport these files to the ImportExportTicket package.
It could also be considered to integrate the two files from ImportExportTicket into OTOBO 11.1:
<Filelist>
<File Permission="660" Location="Kernel/Config/Files/XML/ImportExportTicket.xml" />
<File Permission="660" Location="Kernel/System/ImportExport/ObjectBackend/Ticket.pm" />
</Filelist>
This would allow us to simply mark ImportExportTicket as integrated, without having to consider two different cases.
Discussed this with @svenoe.
prefix: states the prefix of the generated files; the default is the empty string.
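For reference, a minimal sketch of how such an option can be declared in the Configure() method of an OTOBO console command, following the usual Kernel::System::Console::BaseCommand API. The description text is an assumption, not the actual code.

sub Configure {
    my ( $Self, %Param ) = @_;

    $Self->Description('Import data according to an ImportExport template.');

    # declare the --prefix option discussed above
    $Self->AddOption(
        Name        => 'prefix',
        Description => 'Prefix of the generated files. Defaults to the empty string.',
        Required    => 0,
        HasValue    => 1,
        ValueRegex  => qr/.*/smx,
    );

    return;
}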
Manual testing is underway. But I think that I have misused the tool. I'm wondering whether a simple ETL workflow should work. Let's try that simple flow first; the more complex ImportExport runs just produced a lot of warnings.
Carried out the simple ETL test. This worked after activating "Empty fields indicate that the current values are kept".
Did the same test with 7589 tickets in 76 chunks of 100 tickets. This looked fine as well, after a couple of fixes.
I tried to update the subject of articles by marking the mapping Article_ArticleID as an identifier. However, the article subject was not updated. It looks like this is not implemented. See https://github.com/RotherOSS/otobo/blob/cd16817710a34480d6b0fb3b998eab3dc798efd2/Kernel/System/ImportExport/ObjectBackend/Ticket.pm#L1941 .
The chunking is not as stable as it should be. When importing two chunks of 1000 tickets, each ticket having 4 articles, I got the following error in var/log/Daemon/SchedulerTaskWorkerERR.log:
DBD::mysql::db do failed: Deadlock found when trying to get lock; try restarting transaction at /opt/otobo/Kernel/System/DB.pm line 586.
ERROR: OTOBO-otobo.Daemon.pl - Daemon Kernel::System::Daemon::DaemonModules::SchedulerTaskWorker-10 Perl: 5.40.0 OS: linux Time: Wed Aug 28 15:12:56 2024

Message: Deadlock found when trying to get lock; try restarting transaction, SQL: '
    UPDATE scheduler_task
    SET lock_key = ?, lock_time = '2024-08-28 15:12:56', lock_update_time = '2024-08-28 15:12:56'
    WHERE lock_key = 0 AND id = ?'

Traceback (8472):
    Module: Kernel::System::Daemon::SchedulerDB::TaskLock Line: 506
    Module: Kernel::System::Daemon::DaemonModules::SchedulerTaskWorker::Run Line: 180
    Module: (eval) Line: 387
    Module: main::Start Line: 387
    Module: bin/otobo.Daemon.pl Line: 165
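Deadlocks like this one are transient by nature; MySQL explicitly suggests restarting the transaction. A possible mitigation, sketched here with a hypothetical helper that is not part of OTOBO, is to retry the statement a few times before giving up. This assumes the usual Kernel::System::DB API, where Do() runs a statement and Error() returns the last DBI error string.

# _DoWithRetry() is a hypothetical wrapper, not an existing OTOBO method
sub _DoWithRetry {
    my ( $DBObject, %Param ) = @_;

    my $MaxTries = 3;
    TRY:
    for my $Try ( 1 .. $MaxTries ) {

        return 1 if $DBObject->Do(%Param);

        # only retry when the error really is a deadlock
        last TRY if ( $DBObject->Error() // '' ) !~ m/deadlock/i;

        sleep $Try;    # simple backoff before the next attempt
    }

    return;
}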
Importing 10 chunks of 1000 tickets failed on the first try. For some reason no new tickets were created. However, this was while running under Devel::NYTProf.
Another finding from a run where 10x1000 tickets were imported. The import into the database took from 15:28 to 15:54 UTC. But then the process did not finish until XX:XX UTC. My guess is that the event handlers were still running. It feels like events were piling up for all 10,000 tickets. Two things can be improved there:
First, call $TicketObject->EventHandlerTransaction(); whenever a ticket was created or updated; a sketch follows below.
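A minimal sketch of that first point: flush the queued transaction events right after each imported ticket, instead of letting them pile up until the end of the process. The import loop and @Rows are simplified assumptions, not the actual backend code.

my $TicketObject = $Kernel::OM->Get('Kernel::System::Ticket');

ROW:
for my $Row (@Rows) {

    # create or update the ticket from the current CSV row (simplified)
    my $TicketID = $TicketObject->TicketCreate( %{$Row} );

    next ROW unless $TicketID;

    # handle the queued transaction events for this ticket right away
    $TicketObject->EventHandlerTransaction();
}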
Also, in the event handling phase:

bin/otobo.Console.pl Admin::ImportExport::Import --template-number 1 --prefix 20240828a Export_tickets_20240828a 2>&1 | tee import.out

DBD::mysql::db do failed: Deadlock found when trying to get lock; try restarting transaction at /opt/otobo/Kernel/System/DB.pm line 586.
ERROR: OTOBO-otobo.Console.pl-Admin::ImportExport::Import-10 Perl: 5.40.0 OS: linux Time: Wed Aug 28 16:05:28 2024

Message: Deadlock found when trying to get lock; try restarting transaction, SQL: '
    UPDATE scheduler_task
    SET lock_key = 0
    WHERE lock_key = 1 AND id = ?'

Traceback (4360):
    Module: Kernel::System::Daemon::SchedulerDB::TaskAdd Line: 183
    Module: Kernel::System::Scheduler::TaskAdd Line: 98
    Module: Kernel::System::AsynchronousExecutor::AsyncCall Line: 142
    Module: Kernel::System::Ticket::Event::TicketAppointments::Run Line: 80
    Module: (eval) Line: 241
    Module: Kernel::System::EventHandler::EventHandler Line: 235
    Module: Kernel::System::EventHandler::EventHandlerTransaction Line: 308
    Module: Kernel::System::ObjectManager::ObjectEventsHandle Line: 459
    Module: Kernel::System::ObjectManager::ObjectsDiscard Line: 507
    Module: Kernel::System::ObjectManager::DESTROY Line: 677
    Module: (eval) Line: 40
    Module: bin/otobo.Console.pl Line: 40
Activated the handling of transactions per ticket and ran:
time -p bin/otobo.Console.pl Admin::ImportExport::Import --template-number 1 --prefix 20240828a Export_tickets_20240828a 2>&1
The result was:
Reading file Export_tickets_20240828a/20240828a000008.csv.
Import of 5000 Ticket records:
Success: 5000 succeeded
Error: 0 failed.
Import of 5000 Ticket records: 1000 Created
Import of 5000 Ticket records: 4000 Updated
Import complete.
Reading file Export_tickets_20240828a/20240828a000009.csv.
Segmentation fault (core dumped)
real 2468.73
user 1311.83
sys 275.48
This is strange. I tested importing only chunks 9 and 10. This worked fine. Testing all 10 chunks, I again got a core dump after importing the complete chunk 9. Importing chunk 10 by itself worked again.
Pushing the current state. TODO:
The adaptations were up to now done for OTOBO 11.1.x. But that is currently not very useful, as this version is not released yet. So let's backport it to OTOBO 11.0.x. As no new features should be added in a patch level release, let's include it in the master branch of https://github.com/RotherOSS/ImportExportTicket.
Some more analysis of the core dump. The suspicion is that the stack size of 8 MB is running out. This could be due to XS code. Two candidates are Text::CSV_XS and the database driver.
It does not look like the reading of the CSV file is the problem. The 209661 lines, corresponding to 50000 entities, of the test input can be read in without problems. It is just not clear why the whole file must be read in before parsing begins.
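For what it's worth, Text::CSV_XS can also parse row by row instead of slurping the whole file first. A minimal streaming sketch; the file name is just an example:

use Text::CSV_XS;

my $CSV = Text::CSV_XS->new( { binary => 1, auto_diag => 1 } );

open my $FH, '<:encoding(UTF-8)', '20240828a000001.csv' or die "Can't open file: $!";

# getline() reads and parses one row at a time, keeping memory flat
while ( my $Row = $CSV->getline($FH) ) {
    # process @{$Row} here
}

close $FH;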
It also looks like the chunking is not really the problem: when importing the combined file, the core dump happens after importing 8735 tickets. That is in the same ballpark as with the chunked import.
Another wild guess: Kernel::System::DB::Connect() creates an anonymous sub on every call. This sub is usually not needed and is presumably cleaned up, because it is stored in a lexical variable. But maybe something fishy is going on there.
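In the abstract, the suspected pattern looks like this; the callback body is made up, only the shape matters:

sub Connect {
    my ( $Self, %Param ) = @_;

    # closes over $Self, so Perl has to build a fresh code ref on
    # every call; it should be freed again when the lexical variable
    # goes out of scope at the end of Connect()
    my $OnConnect = sub {
        return $Self->{dbh};
    };

    # ... establish the connection; $OnConnect may never be used ...

    return $Self->{dbh};
}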
The wild idea might have had an effect. With a small code change in Kernel::System::DB::Connect() the sample file with 10000 tickets could be imported. The stack size grew slowly.
$ pidstat -s -l -C perl
19:44:32 1000 475242 1256 1252 perl bin/otobo.Console.pl Admin::ImportExport::Import --template-number 1 Export_tickets_20270828a.csv
But after the last imported ticket there was still a core dump. I suspect that Kernel::System::Ticket::Article might still have hoarded transaction events.
TODO:
- Decide how to proceed with the change in the Connect() method.

Another experiment. First I rolled back the changes in Kernel::System::DB::Connect(), ran the case with 10000 tickets again, and watched the stack size with watch -n 0.2 pidstat -H -s -l -p 15340.
All 10,000 tickets were imported. During the import the stack size grew in the same way as with the adapted DB module.
Every 2.0s: pidstat -H -s -l -p 15340    bernhard-Aspire-A515-57: Sat Aug 31 10:30:03 2024

Linux 6.8.0-40-generic (bernhard-Aspire-A515-57)    31.08.2024    _x86_64_    (12 CPU)

1725093003   UID       PID  StkSize  StkRef  Command
1725093003  1000     15340     1172    1172  perl bin/otobo.Console.pl Admin::ImportExport::Import --template-number 1 Export_tickets_20280828a.csv
At the end I briefly saw a stack size of over 6000 kB before the core was dumped. So this is the first real indication that the stack size limit is reached, presumably during global destruction. The ulimit for the stack size, ulimit -s, is 8192 kB. So the core is dumped and the process is terminated when that limit is reached.
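The current limits can also be inspected from within the running process, at least on Linux. A quick sketch, not OTOBO code:

# print the stack size limit of the current process (Linux only)
open my $FH, '<', '/proc/self/limits' or die "Can't read limits: $!";
print grep { m/Max stack size/ } <$FH>;
close $FH;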
Activated the early triggering of the article transactions. Article transactions are now handled just before the next article is created. Some observations:
TODO:
More experiments.
Removing all event handlers still let the stack size grow. So the problem is likely not in one of the event handlers.
Disabling Kernel::System::EventHandler::EventHandlerTransaction() did the trick.
diff --git a/Kernel/System/EventHandler.pm b/Kernel/System/EventHandler.pm
index c6b66c80f9..fdfc98b0bf 100644
--- a/Kernel/System/EventHandler.pm
+++ b/Kernel/System/EventHandler.pm
@@ -276,6 +276,8 @@ Kernel::System::EventHandler, like this:
sub EventHandlerTransaction {
my ( $Self, %Param ) = @_;
+ return 1;
+
# remember, we are in destroy mode, do not execute new events
$Self->{EventHandlerTransaction} = 1;
The stack size is no longer growing. So disabling EventHandlerTransaction() is recommended for the upcoming test migration.
The reason why the stack size increases is not obvious. It likely has to do with the localized object manager, which is destroyed, whereupon EventHandlerTransaction() is called again in the destructor.
TODO:
Turns out that this is apparently caused by binmode, which is called in the constructor of Kernel::System::Encode. This likely does not affect the web app, as binmode is only called when an interactive TTY is attached.
test_stack_size.pl.txt
The fix is to avoid recreating Kernel::System::Encode.
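One possible shape of such a fix, sketched here as a guess rather than the actual commit: apply the console layers only once per process, even when the object is recreated after an ObjectsDiscard(). The flag name is hypothetical.

package Kernel::System::Encode;

# package-level flag, a hypothetical name
my $ConsoleLayersApplied;

sub new {
    my ( $Type, %Param ) = @_;

    my $Self = bless {}, $Type;

    # push the encoding layers only once, and only for interactive use
    if ( !$ConsoleLayersApplied && -t STDOUT ) {
        binmode STDOUT, ':encoding(UTF-8)';
        binmode STDERR, ':encoding(UTF-8)';
        $ConsoleLayersApplied = 1;
    }

    return $Self;
}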
TODO:
The core dump issue looks like it is solved. I got a stack size of about 300 kB after importing 10000 tickets via the console command.
TODO:
Harald Jörg kindly analyzed the behavior of binmode: http://www.perl-community.de/bat/poard/thread/21619#ms_196692 . I/O in Perl works in layers, so every call to binmode adds a layer. This is more or less stated in https://perldoc.perl.org/PerlIO . So this is not a bug in Perl, but a consequence of the design.
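The layer stacking is easy to reproduce outside of OTOBO. A minimal standalone script, not the attached test script:

#!/usr/bin/env perl
use strict;
use warnings;
use PerlIO;    # core module providing get_layers()

# every binmode call with an explicit layer pushes a new PerlIO layer
binmode STDOUT, ':encoding(UTF-8)' for 1 .. 3;

my @Layers = PerlIO::get_layers(*STDOUT);
print scalar(@Layers), " layers: @Layers\n";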
ImportExportTicket has been integrated into OTOBO 11.1. Closing this issue.
ImportExport is integrated in OTOBO 11.0 and OTOBO 11.1. But there are no console commands that perform Import and Export. So let's provide them. The new commands will be almost exact copies of the commands in ITSMConfigurationManagement, https://github.com/RotherOSS/ITSMConfigurationManagement/tree/master/Kernel/System/Console/Command/Admin/ITSM/ImportExport .
Support for chunking will be added to the commands and to the Translations object backend. Chunking the translations is basically only meant as an example, as it is unlikely that translations will exhaust the machine's memory.
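To make the chunking concrete: the commands write one numbered file per chunk, using the prefix option. A rough sketch of the naming scheme, inferred from the file names in the test runs above; NextChunk() and WriteCSV() are hypothetical helpers, not existing OTOBO functions.

my $Prefix    = '20240828a';    # value of --prefix in the tests above
my $ChunkSize = 1000;
my $Counter   = 0;

# fetch the rows chunk by chunk and write one CSV file per chunk
while ( my $Rows = NextChunk($ChunkSize) ) {

    # yields file names like 20240828a000001.csv, 20240828a000002.csv, ...
    my $FileName = sprintf '%s%06d.csv', $Prefix, ++$Counter;
    WriteCSV( $FileName, $Rows );

    last if @{$Rows} < $ChunkSize;    # final, short chunk
}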
Do not add the commands in rel-11_0, as patch releases should have no new features.