jolicode / elastically

🔍 JoliCode's Elastica wrapper to bootstrap Elasticsearch PHP integrations

Get new index instance after re-indexing #89

Closed bogdandubyk closed 3 years ago

bogdandubyk commented 3 years ago

Hi, I'm writing a test for a reindexing CLI command (I'm using Symfony). The test does the following:

  1. create 10 records in the database
  2. index only 5 of them as documents
  3. make sure the index exists and has 5 documents
  4. run the CLI command, which should replace the old index (5 documents) with a new one holding 10 documents (as there are 10 entries in the DB)
  5. make sure the index (with the same name) now has 10 documents

Here is the code (sorry for any typos, I changed the code a bit):

    public function testReindex(): void
    {
        $indexName = 'some_index';
        /** @var EntityManagerInterface $entityManager */
        $entityManager = self::$container->get(EntityManagerInterface::class);

        // create 10 records, but build DTOs (to be indexed) for only the first 5
        $records = [];
        $dtos = [];
        for ($i = 1; $i < 11; $i++) {
            $record = $this->createDbRecord($i, $entityManager);
            if ($i < 6) {
                $dtos[] = $this->createDto($record);
            }
            $records[] = $record;
        }
        $entityManager->flush();

        // push the 5 DTOs straight to the index name (no IndexBuilder involved)
        $client = self::$container->get(\JoliCode\Elastically\Client::class);
        $indexer = $client->getIndexer();
        foreach ($dtos as $dto) {
            $indexer->scheduleIndex($indexName, new Document((string) $dto->id, $dto));
        }
        $indexer->flush();

        // make sure the index exists and holds 5 documents
        $index = $this->client->getIndex($indexName);
        self::assertTrue($index->exists());
        $count = 0;
        foreach ($records as $record) {
            try {
                $index->getModel($record->id);
                $count++;
            } catch (\Elastica\Exception\NotFoundException) {
                // this record was not indexed, skip it
            }
        }
        self::assertEquals(5, $count);

        // run the CLI command to reindex all data: it should drop the old index
        // and create a new one with the 10 records I have in the DB
        $application = new Application(self::$kernel);
        $command = $application->find('app:elasticsearch:generate-bp-search-index');
        $commandTester = new CommandTester($command);
        $commandTester->execute([]);

        // now I want to make sure the index holds all 10 documents
        $index = $this->getIndex($indexName);
        $count = 0;
        foreach ($records as $record) {
            /** @var Dto $dto */
            $dto = $index->getModel($record->id);
            self::assertInstanceOf(Dto::class, $dto);
            $count++;
        }

        self::assertEquals(10, $count);
    }

So the issue is that at the end (the last block of asserts) the test still sees an index with only 5 documents (the index that should have been purged). On the Elasticsearch side everything works fine: a new index with 10 documents is created and the old one is purged.

So the question is: how can I refresh the client so that it sees the new index? I tried $this->getIndex($indexName)->refresh(). I also tried to generate a new index using the builder and got an error like "Index some_index_2021-08-09-093259" is already created, something is wrong. I also tried sleep(10) in case there was some re-indexing delay.

damienalexandre commented 3 years ago

Could you post the code of your custom command app:elasticsearch:generate-bp-search-index?

Also, it looks like your test does not create the index with IndexBuilder before indexing.

And about "Index some_index_2021-08-09-093259" is already created: this can happen when the index creation method is called twice within the same second (as we use the current time in the name).
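
To illustrate why two creations within the same second collide, here is a minimal sketch in plain PHP. It assumes the physical index name is the logical name plus a timestamp with one-second resolution, which is what the name in the error message suggests. The function buildTimestampedName() is hypothetical and is not Elastically's API, and the fixed timestamp is only chosen so that the output matches the name quoted above.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of Elastically): derives a physical index
 * name from a logical name and a Unix timestamp, with one-second resolution.
 */
function buildTimestampedName(string $indexName, int $timestamp): string
{
    // gmdate() keeps the sketch independent of the configured timezone.
    return sprintf('%s_%s', $indexName, gmdate('Y-m-d-His', $timestamp));
}

// Two calls within the same second, and one call a second later.
$first  = buildTimestampedName('some_index', 1628501579);
$second = buildTimestampedName('some_index', 1628501579);
$later  = buildTimestampedName('some_index', 1628501580);

var_dump($first === $second); // same second: the names are identical
var_dump($first === $later);  // next second: the names differ
```

Under that assumption, a test that builds an index and then immediately runs a command that builds it again can hit the same name twice, while a run that crosses a second boundary gets a distinct name.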

bogdandubyk commented 3 years ago

Here it is:

class CreateIndex extends Command
{
    protected static $defaultName = 'app:elasticsearch:generate-bp-search-index';

    protected function configure(): void
    {
        $this
            ->setDescription('Build new index from scratch and populate.');
    }

    public function __construct(
        private Client $client,
        private DatabaseRepository $repository,
        private MessageBusInterface $bus
    ) {
        parent::__construct();
    }

    protected function execute(InputInterface $input, OutputInterface $output): int
    {
        $indexName = 'some_index';
        $indexBuilder = $this->client->getIndexBuilder();
        $newIndex = $indexBuilder->createIndex($indexName);
        $indexer = $this->client->getIndexer();

        $records = $this->repository->getRecordsForIndex();

        foreach ($records as $record) {
            $indexer->scheduleIndex($newIndex, new Document((string) $record->id, $record));
        }

        $indexer->flush();

        $indexBuilder->markAsLive($newIndex, $indexName);
        $indexBuilder->speedUpRefresh($newIndex);
        $indexBuilder->purgeOldIndices($indexName);

        return Command::SUCCESS;
    }
}

bogdandubyk commented 3 years ago

Yep, the test is not using the builder. Should it? I don't know how, but it works like this:

        $client = self::$container->get(\JoliCode\Elastically\Client::class);
        $indexer = $client->getIndexer();
        foreach ($dtos as $dto) {
            $indexer->scheduleIndex($indexName, new Document((string) $dto->id, $dto));
        }
        $indexer->flush();

So it looks like the index gets created without the builder, because I'm deleting all indexes between test cases.

damienalexandre commented 3 years ago

Yes, in Elasticsearch, if you push a document and the index does not exist, it's created by default.

What I suspect is that your test creates the index 'some_index' as a side effect. And when you run your command to restart / migrate this index, it can't "mark it as live" via a 'some_index' alias because an index named 'some_index' already exists.

Can you check the result of $indexBuilder->markAsLive($newIndex, $indexName); ?
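
The suspected scenario can be sketched with a toy in-memory model. This is plain PHP written for illustration only: ToyCluster and its methods are hypothetical and are not Elasticsearch, Elastica, or Elastically code. The sketch relies on two behaviors: indexing into a missing name implicitly creates a concrete index, and an alias cannot take the name of an existing index.

```php
<?php
declare(strict_types=1);

/**
 * Toy in-memory stand-in for a cluster's namespace (hypothetical; not
 * Elasticsearch, Elastica, or Elastically code).
 */
final class ToyCluster
{
    /** @var array<string, int> concrete index name => document count */
    private array $indices = [];

    /** @var array<string, string> alias name => concrete index name */
    private array $aliases = [];

    public function createIndex(string $name): void
    {
        if (isset($this->indices[$name]) || isset($this->aliases[$name])) {
            throw new RuntimeException("Name '$name' is already taken.");
        }
        $this->indices[$name] = 0;
    }

    /** Indexing into a missing name implicitly creates a concrete index. */
    public function indexDocument(string $name): void
    {
        $target = $this->aliases[$name] ?? $name;
        $this->indices[$target] = ($this->indices[$target] ?? 0) + 1;
    }

    public function addAlias(string $alias, string $index): void
    {
        if (isset($this->indices[$alias])) {
            throw new RuntimeException("Cannot create alias '$alias': an index with that name exists.");
        }
        $this->aliases[$alias] = $index;
    }

    /** Resolves an alias if there is one, then returns the document count. */
    public function count(string $name): int
    {
        return $this->indices[$this->aliases[$name] ?? $name] ?? 0;
    }
}

$cluster = new ToyCluster();

// Test setup: 5 documents pushed straight to 'some_index' (auto-created).
for ($i = 0; $i < 5; $i++) {
    $cluster->indexDocument('some_index');
}

// Reindex command: build a timestamped index holding 10 documents...
$cluster->createIndex('some_index_2021-08-09-093259');
for ($i = 0; $i < 10; $i++) {
    $cluster->indexDocument('some_index_2021-08-09-093259');
}

// ...then try to point the 'some_index' alias at it.
$aliasFailed = false;
try {
    $cluster->addAlias('some_index', 'some_index_2021-08-09-093259');
} catch (RuntimeException $e) {
    $aliasFailed = true;
}

// Reads through 'some_index' still hit the auto-created concrete index.
$visibleCount = $cluster->count('some_index');
```

In this model the alias step fails, so reading 'some_index' keeps returning the 5 documents of the auto-created index even though the new timestamped index holds all 10. That matches the symptom described in the test.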

bogdandubyk commented 3 years ago

@damienalexandre not sure I understand, but I changed the logic that creates/seeds the index in the test to use the builder, like this:

        $client = self::$container->get(\JoliCode\Elastically\Client::class);
        $indexer = $client->getIndexer();
        $indexBuilder = $client->getIndexBuilder();

        // create the index through the builder unless it already exists
        $index = $this->client->getIndex($indexName);
        $exists = $index->exists();
        if ($exists === false) {
            $index = $indexBuilder->createIndex($indexName);
        }
        foreach ($dtos as $dto) {
            $indexer->scheduleIndex($index, new Document((string) $dto->id, $dto));
        }

        // a freshly built index has to be marked as live under $indexName
        if ($exists === false) {
            $indexBuilder->markAsLive($index, $indexName);
            $indexBuilder->speedUpRefresh($index);
            $indexBuilder->purgeOldIndices($indexName);
        }
        $indexer->flush();

And now it works, with the index explicitly created through the builder. Thank you!