csirtgadgets / massive-octo-spice

DEPRECATED - USE v3 (bearded-avenger)
https://github.com/csirtgadgets/bearded-avenger-deploymentkit/wiki
GNU Lesser General Public License v3.0

migrate bug? #397

Closed villain closed 8 years ago

villain commented 8 years ago

would like to know if anyone else sees the following after migrating a lot of data from a v1 instance:

cif@cif:/var/tmp/massive-octo-spice/v1migration$ perl -I../src/lib -Ilib bin/migrate-data.pl --threads 12 --psql-host x.x.x.x --es-token token --es-host x.x.x.x
[2016-03-27T11:17:23,249Z][6999][INFO]: staring up..
[2016-03-27T11:17:23,250Z][6999][INFO]: starting up ES connection...
[2016-03-27T11:17:23,250Z][6999][INFO]: checking journal: /tmp/cif-migrate.journal
[2016-03-27T11:17:23,250Z][6999][INFO]: creating threads...
[2016-03-27T11:17:23,389Z][6999][INFO]: starting workers
[2016-03-27T11:44:31,918Z][6999][INFO]: starting writer thread...
Subroutine CIF::Legacy::Archive::db_Main redefined at /usr/local/share/perl/5.18.2/Ima/DBI.pm line 278.
Label empty at /usr/share/perl5/URI/_server.pm line 25 thread 5.

zmq_getsockopt: Socket operation on non-socket at /usr/local/share/perl/5.18.2/ZMQ/FFI/ErrorHelper.pm line 49 thread 1.
    ZMQ::FFI::ErrorHelper::fatal('ZMQ::FFI::ErrorHelper=HASH(0x3724f08)', 'zmq_getsockopt') called at /usr/local/share/perl/5.18.2/ZMQ/FFI/ErrorHelper.pm line 28 thread 1
    ZMQ::FFI::ErrorHelper::check_error('ZMQ::FFI::ErrorHelper=HASH(0x3724f08)', 'zmq_getsockopt', -1) called at (eval 174) line 17 thread 1
    ZMQ::FFI::ErrorHandler::check_error('ZMQ::FFI::ZMQ3::Socket=HASH(0x7ff2b40a2b50)', 'zmq_getsockopt', -1) called at /usr/local/share/perl/5.18.2/ZMQ/FFI/SocketBase.pm line 224 thread 1
    ZMQ::FFI::SocketBase::get('ZMQ::FFI::ZMQ3::Socket=HASH(0x7ff2b40a2b50)', 15, 'int') called at /usr/local/share/perl/5.18.2/ZMQ/FFI/SocketBase.pm line 180 thread 1
    ZMQ::FFI::SocketBase::has_pollin('ZMQ::FFI::ZMQ3::Socket=HASH(0x7ff2b40a2b50)') called at bin/migrate-data.pl line 289 thread 1
    main::pager_routine() called at bin/migrate-data.pl line 159 thread 1
    eval {...} called at bin/migrate-data.pl line 159 thread 1

Segmentation fault (core dumped)

the cif-migrate.journal value is quite large:

$ cat /tmp/cif-migrate.journal
82963233

in addition, if i remove all of the indexes from ES and remove the cif-migrate.journal file, it all works fine

wesyoung commented 8 years ago

hey bud, i've seen some similar stuff when migrating and pushed some of the fixes i'm currently testing into develop/v1migration. the caveat here is that this will start creating "monthly" es indices, which we introduced as an option in RC.12. while it shouldn't affect your setup, it might make sense to try this with RC12 on the other side...

still a work in progress as we're testing the move of a lot of data, so we might find other stuff, this is where we're at right now.

wesyoung commented 8 years ago

https://github.com/csirtgadgets/massive-octo-spice/commit/04cf46673f36cced6b541a4e2c2abdbc14d41589

villain commented 8 years ago

i have the monthly indexes working, but the migrate-data.pl still seems to be creating daily indexes?

wesyoung commented 8 years ago

on the host you're using to migrate, may need to replace ~/massive-octo-spice-... with the latest release where lib/CIF/Storage/ElasticSearch.pm has been updated to understand monthly bits. (v1migration/bin/migrate.. is using that local library to push data to ES).

when i'm doing it, usually i'm on a separate host that connects to my old instance via psql and the new instance via http (elasticsearch) so i have the source checked out to ~/massive-octo-spice and am running it from within there..

does that sorta make sense? (this is messy.. i know).

villain commented 8 years ago

yep, makes sense. i blew away the VM i created to do the migration, did a reinstall and ended up with;

[2016-04-02T14:46:41,799Z][30613][INFO]: staring up..
[2016-04-02T14:46:41,800Z][30613][INFO]: starting up ES connection...
[2016-04-02T14:46:41,800Z][30613][INFO]: checking journal: /tmp/cif-migrate.journal
[2016-04-02T14:46:41,800Z][30613][INFO]: creating threads...
[2016-04-02T14:46:41,969Z][30613][INFO]: starting workers
[2016-04-02T15:25:46,996Z][30613][INFO]: starting writer thread...
Subroutine CIF::Legacy::Archive::db_Main redefined at /usr/local/share/perl/5.18.2/Ima/DBI.pm line 278.
hash- or arrayref expected (not a simple scalar, use allow_nonref to allow this) at bin/migrate-data.pl line 282.

Segmentation fault (core dumped)

a5m0 commented 8 years ago

I am having the same issue:

user@server:~/massive-octo-spice/v1migration$ perl -I../src/lib -Ilib bin/migrate-data.pl --threads 4 --psql-host 10.x.x.x --es-token xxxxxxxxxxxxxxx
[2016-04-04T16:11:21,743Z][5687][INFO]: staring up..
[2016-04-04T16:11:21,743Z][5687][INFO]: starting up ES connection...
[2016-04-04T16:11:21,744Z][5687][INFO]: checking journal: /tmp/cif-migrate.journal
[2016-04-04T16:11:21,744Z][5687][INFO]: creating threads...
[2016-04-04T16:11:21,817Z][5687][INFO]: starting workers
[2016-04-04T16:36:51,937Z][5687][INFO]: starting writer thread...
Subroutine CIF::Legacy::Archive::db_Main redefined at /usr/local/share/perl/5.18.2/Ima/DBI.pm line 278.
hash- or arrayref expected (not a simple scalar, use allow_nonref to allow this) at bin/migrate-data.pl line 281.

Segmentation fault (core dumped)

wesyoung commented 8 years ago

what does Data::Dumper say for those lines..? chances are there might be some bad data trying to get passed through that you might have to check for and skip... bin/migrate-data.pl:281

a5m0 commented 8 years ago

Sorry, I'm not that familiar with Perl/CIF. Is there a command I can run for what you're asking?

wesyoung commented 8 years ago

sorry; i mis-spoke on the line number:

https://github.com/csirtgadgets/massive-octo-spice/blob/develop/v1migration/bin/migrate-data.pl#L445

add a statement that looks like:

warn Dumper($data);

if($data->{'address'}){
        $data->{'address'} =~ s/hxxp\:\/\///g;
}

and then run the migrate script with the "-t 1" flag to only use one thread, it should fail rather quickly and show us what the data looks like that it might be choking on...
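The "check for and skip" idea can be sketched roughly as follows. This is a minimal illustration, not code from migrate-data.pl: `is_encodable` and `encode_or_skip` are hypothetical helpers, the sample rows are made up, and JSON::PP stands in for whichever JSON module the script actually uses. It relies on the fact that a strict JSON encoder rejects a plain scalar with the "hash- or arrayref expected" message seen in the logs; whether a stray scalar is really what reaches the encoder in this case is not established in the thread.

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical helper (not part of migrate-data.pl): true only for values a
# strict JSON encoder accepts at the top level, i.e. hash or array references.
sub is_encodable {
    my ($rec) = @_;
    my $type = ref $rec;
    return ($type eq 'HASH' || $type eq 'ARRAY') ? 1 : 0;
}

# allow_nonref(0) makes the strict behaviour explicit, whatever the default
# of the installed JSON::PP version is. canonical() sorts hash keys.
my $json = JSON::PP->new->canonical->allow_nonref(0);

# Hypothetical helper: returns JSON text, or nothing for a record that
# should be skipped instead of crashing the encoder.
sub encode_or_skip {
    my ($rec) = @_;
    unless (is_encodable($rec)) {
        warn 'skipping non-reference record: '
            . (defined $rec ? $rec : 'undef') . "\n";
        return;
    }
    return $json->encode($rec);
}

# Made-up sample rows: one usable record, one plain scalar, one undef.
my @rows    = ({ address => 'example.com' }, '-1', undef);
my @encoded = grep { defined } map { encode_or_skip($_) } @rows;
print "$_\n" for @encoded;
```

With a guard like this, a bad row produces a warning that shows the offending value, which makes it easier to see what the data looks like than a crash inside the encoder.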

a5m0 commented 8 years ago

Made the insertion like you showed, but the output was exactly the same. To confirm, starting at line 440:

$data = @$data[0];
return '-1' unless $data;

warn Dumper($data);

if($data->{'address'}){
    $data->{'address'} =~ s/hxxp\:\/\///g;
}

and same error:

user@server:~/massive-octo-spice/v1migration$ perl -I../src/lib -Ilib bin/migrate-data.pl --threads 1 --psql-host 10.x.x.x --es-token xxxxxx
[2016-04-12T13:28:07,697Z][1170][INFO]: staring up..
[2016-04-12T13:28:07,698Z][1170][INFO]: starting up ES connection...
[2016-04-12T13:28:07,698Z][1170][INFO]: checking journal: /tmp/cif-migrate.journal
[2016-04-12T13:28:07,698Z][1170][INFO]: creating threads...
[2016-04-12T13:28:07,781Z][1170][INFO]: starting workers
[2016-04-12T14:04:55,056Z][1170][INFO]: starting writer thread...
Subroutine CIF::Legacy::Archive::db_Main redefined at /usr/local/share/perl/5.18.2/Ima/DBI.pm line 278.
hash- or arrayref expected (not a simple scalar, use allow_nonref to allow this) at bin/migrate-data.pl line 281.

Segmentation fault (core dumped)

and /tmp/cif-migrate.journal is 0

giovino commented 8 years ago

debug output... the initial count query does not appear to fail

[2016-04-13T12:27:42,477Z][19913][INFO][main:136]: staring up..
[2016-04-13T12:27:42,477Z][19913][INFO][main:149]: starting up ES connection...
[2016-04-13T12:27:42,478Z][19913][INFO][main:155]: checking journal: /tmp/cif-migrate.journal
[2016-04-13T12:27:42,478Z][19913][INFO][main:158]: creating threads...
$VAR1 = 1;
[2016-04-13T12:27:42,563Z][19913][INFO][main:214]: starting workers
[2016-04-13T12:27:42,563Z][19913][DEBUG][main:224]: connecting to archive..
[2016-04-13T13:03:40,405Z][19913][DEBUG][main:253]: total count: 125498850
[2016-04-13T13:03:40,406Z][19913][DEBUG][main:254]: pages: 1254989
[2016-04-13T13:03:40,407Z][19913][DEBUG][main:261]: sending ctrl warm-up msg...
[2016-04-13T13:03:40,492Z][19913][DEBUG][main:267]: creating 4 worker threads...
[2016-04-13T13:03:40,491Z][19913][INFO][main:323]: starting writer thread...
Subroutine CIF::Legacy::Archive::db_Main redefined at /usr/local/share/perl/5.18.2/Ima/DBI.pm line 278.
$VAR1 = {
    ...entries removed...
hash- or arrayref expected (not a simple scalar, use allow_nonref to allow this) at bin/migrate-data.pl line 284.

[2016-04-13T13:03:40,724Z][19913][DEBUG][main:274]: executing sql...
[2016-04-13T13:03:40,734Z][19913][DEBUG][main:282]: sending next pages to workers...
Segmentation fault (core dumped)

giovino commented 8 years ago

Turns out a5m0 was trying to migrate from a CIFv0 instance. Not sure whether villain is still having this problem. I'm closing this issue for now; please reopen it or create a new one if the problems continue. Thanks!

villain commented 8 years ago

yep, still having the problem. just did another git pull, getting the same error. i'm migrating from a v1 instance

[2016-04-26T08:42:36,147Z][12427][INFO]: staring up..
[2016-04-26T08:42:36,148Z][12427][INFO]: starting up ES connection...
[2016-04-26T08:42:36,149Z][12427][INFO]: checking journal: /tmp/cif-migrate.journal
[2016-04-26T08:42:36,149Z][12427][INFO]: creating threads...
[2016-04-26T08:42:36,438Z][12427][INFO]: starting workers
[2016-04-26T10:14:57,394Z][12427][INFO]: starting writer thread...
Subroutine CIF::Legacy::Archive::db_Main redefined at /usr/local/share/perl/5.18.2/Ima/DBI.pm line 278.
hash- or arrayref expected (not a simple scalar, use allow_nonref to allow this) at bin/migrate-data.pl line 282.

Segmentation fault (core dumped)

it was working ok until the more recent changes to the migrate script