getgrav / grav-premium-issues

Official Grav Premium Issues repository to report problems or ask questions regarding the Premium products offered.
https://getgrav.org/premium

[algolia-pro] CLI indexing. Crawl indexing gets stuck at 0 and doesn't start indexing. #337

Open Sogl opened 1 year ago

Sogl commented 1 year ago

I prepared my sitemap for indexing Flex objects as described here: https://getgrav.org/premium/algolia-pro/docs/backend#crawl-page-search-intermediate

Code in my plugin:

public function onSitemapProcessed(Event $e)
{
    $sitemap = $e['sitemap'];
    $directory = $this->grav['flex']->getDirectory('therapies');
    foreach ($directory->getCollection()->filterBy(['published' => true]) as $therapy) {
        $route = "therapies/{$therapy->slug}";
        $entry = new SitemapEntry(
            Utils::url($route, true),
            date('Y-m-d', $therapy->updated_at),
            'daily',
            '1.0'
        );
        $sitemap[Utils::url($route)] = $entry;
    }
    $e['sitemap'] = $sitemap;
}

I use #therapy as the body selector:

image

But I can't start indexing:

% bin/plugin algolia-pro index                

Re-indexing Algolia Search
==========================

 131/131 [============================] 100% 2 secs/2 secs -- Index Config: pages | Algolia Index: pages-ru-grav
   0/157 [>---------------------------]   0% < 1 sec/< 1 sec -- Index Config: pages | Algolia Index: pages-ru-grav

  Unable to display the estimated time if the maximum number of steps is not set.  

Both progress lines show the same index name, as you can see. In Algolia the index has the proper name:

image

I also tried with the additional --indexes parameter:

% bin/plugin algolia-pro index --indexes=crawl

Re-indexing Algolia Search
==========================

   0/157 [>---------------------------]   0% < 1 sec/< 1 sec -- %message%

  Unable to display the estimated time if the maximum number of steps is not set. 

The same error. What's this?

P.S. What I found about this error: https://github.com/symfony/symfony/issues/47244 https://github.com/fr05t1k/codeception-progress-reporter/issues/12

rhukster commented 1 year ago

It sounds like the problem is that the indexing gets stuck at 0 and doesn't start indexing? The message from the progress bar about being unable to display the estimated time is not the real issue?

Your issue title is confusing.

Sogl commented 1 year ago

It sounds like the problem is that the indexing gets stuck at 0 and doesn't start indexing?

Yes.

I did some debugging and found that my Flex pages simply can't be found during the crawl:

image

CrawlPageSearch.php line 230 ($page is null):

$page = $pages->find($route);
...
if ($page instanceof PageInterface) {
    $this->addRecordFromResponse($page, $response, $url, $records, $status);
} else {
    $status[] = [
        'status' => 'error',
        'msg' => 'Page Not Found: ' . $route,
        'url' => $url
    ];
...

I think it's because my routes are created dynamically in my Flex plugin:

public function onPluginsInitialized(): void
{
    if (!$this->isAdmin()) {
        $this->router();
    }
}

public function router()
{
    /** @var Uri $uri */
    $uri = $this->grav['uri'];
    $route = Uri::getCurrentRoute()->getRoute();

    if (Utils::startsWith($route, '/therapies') && !Utils::contains($route, '.')) {
        $this->enable([
            'onPagesInitialized' => ['addTherapyPage', 0]
        ]);
    }
}

public function addTherapyPage()
{
    $route = Uri::getCurrentRoute()->getRoute();

    $normalized = trim($route, '/');
    if (!$normalized) {
        return;
    }

    $parts = explode('/', $normalized, 2);
    $key = array_shift($parts);
    $path = array_shift($parts);

    /** @var Pages $pages */
    $pages = $this->grav['pages'];
    if ($pages->find($route)) {
        /** @var Debugger $debugger */
        $debugger = $this->grav['debugger'];
        $debugger->addMessage("Page {$route} already exists, page cannot be added", 'error');
        return;
    }

    $flex = Grav::instance()->get('flex');
    $therapy = $flex->getObject($path, 'therapies');

    $page = $pages->find('/therapies/therapy');
    if ($page) {
        $page->id($page->modified() . md5($route));
        $page->slug(basename($route));
        $page->folder(basename($route));
        $page->route($route);
        $page->rawRoute($route);
        $page->modifyHeader('object', $path);

        if ($therapy) {
            $title = $therapy->getProperty('title');
            $page->title($title);

            $page->media($therapy->getMedia());

            $page->content($therapy->getProperty('description'));
        }

        $pages->addPage($page, $route);
    }
}

What is the best way to add such objects to the index?

rhukster commented 1 year ago

So let me get this straight: if you go to the route /therapies/depressiva-u-devushki-33-h-let in your browser, what do you get? A working page or a 404? The crawler is saying that the page is not accessible because it's getting a 404 response code when crawling it.

Sogl commented 1 year ago

So let me get this straight: if you go to the route /therapies/depressiva-u-devushki-33-h-let in your browser, what do you get? A working page or a 404?

A working page.

The crawler can't find my route in the routes list (/system/src/Grav/Common/Page/Pages.php):

    /**
     * Find a page based on route.
     *
     * @param string $route The route of the page
     * @param bool   $all   If true, return also non-routable pages, otherwise return null if page isn't routable
     * @return PageInterface|null
     */
    public function find($route, $all = false)
    {
        $route = urldecode((string)$route);

        // Fetch page if there's a defined route to it.
        $path = $this->routes[$route] ?? null;      //HERE

The same happens with the findSiteBasedRoute($route) check.
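
To illustrate what I think is happening, here is a minimal standalone sketch. It is not Grav code: ToyPages and registerDynamicPage() are hypothetical stand-ins for the route table lookup in Pages::find() and for my plugin's router(). It assumes that during CLI indexing the current route is not under /therapies, so the dynamic page is never registered and the lookup returns null.

```php
<?php
declare(strict_types=1);

// Hypothetical toy route table; it only mimics the `$this->routes[$route] ?? null` lookup.
final class ToyPages
{
    /** @var array<string, string> route => page identifier */
    private array $routes = [];

    public function addPage(string $route, string $page): void
    {
        $this->routes[$route] = $page;
    }

    public function find(string $route): ?string
    {
        $route = urldecode($route);

        // Same shape as the lookup in Pages::find(): unknown routes yield null.
        return $this->routes[$route] ?? null;
    }
}

// Hypothetical stand-in for router() + addTherapyPage(): the therapy page is
// registered only for the route of the current request.
function registerDynamicPage(ToyPages $pages, string $currentRoute): void
{
    if (strpos($currentRoute, '/therapies') === 0 && strpos($currentRoute, '.') === false) {
        $pages->addPage($currentRoute, 'therapy-template');
    }
}

// Browser request: the current route is the therapy URL, so the page gets registered.
$web = new ToyPages();
registerDynamicPage($web, '/therapies/example-slug');
$foundInBrowser = $web->find('/therapies/example-slug');

// CLI indexing (assumption: the current route is '/'), so nothing is registered
// and the crawler's lookup for the same route finds no page.
$cli = new ToyPages();
registerDynamicPage($cli, '/');
$foundInCli = $cli->find('/therapies/example-slug');
```

If this model is right, the page works in the browser because each request registers its own route, while the indexer looks up routes that were never added to the route table in its own process.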