BenjaminBeck / bdm_kesearchindexer_flux

A KeSearch indexer for Fluid Powered TYPO3 (Extension: flux)

Doesn't index content #2

Open Chris-dev opened 6 years ago

Chris-dev commented 6 years ago

TYPO3 8.7.9, flux development (no fluidcontent), ke_search 2.6.2 (TER), bdm_kesearchindexer_flux 1.0.0 (github)

The indexer correctly identifies and lists the FCEs in the index (including page pid, FCE created / modified date), but doesn't index the actual text field with the allowKeSearchIndex variable.

My flux setup:

<flux:field.text name="settings.maincontent" label="Text" enableRichText="true"> <flux:form.variable name="allowKeSearchIndex" value="true"/> </flux:field.text>

tim-baecker commented 5 years ago

I can confirm that content is not indexed under the following conditions: TYPO3 9.5.8, ke_search 3.0.4, flux 9.2.0

dgyrman commented 5 years ago

Same problem

X-Tender commented 4 years ago

Seems like it doesn't even work with TYPO3 v9 :( I tried to fix it myself, but it's beyond my knowledge level.

HappyCoding-hue commented 4 years ago

If someone who is smart enough would find the time to create/update the Indexer for Flux and TYPO3 LTS 9.5, I would really appreciate that. I couldn't manage it. Thanks guys.

dgyrman commented 4 years ago

Guys, got it. My colleague at work fixed it. Replace the code in TypesFluidcontent.php with this:

`private function getIndexableFromFluxForm(ProviderInterface $provider, Form $form, array $ttContentRow){
    $indexable = '';
    $flexFormValues = $provider->getFlexFormValues($ttContentRow);

    // flexform
    $flexformData = $ttContentRow['pi_flexform'];

    /* Process the data */
    $parsedData = strip_tags(           // 3. strip html?
        html_entity_decode(             // 2. decode actual content?
            strip_tags($flexformData)   // 1. strip xml?
        )
    );

    /* Collapse runs of two or more whitespace characters into a single space */
    $output = preg_replace('/\s\s+/', ' ', $parsedData);

    // Note: $indexable is built below but never returned; only $output ends up in the index.
    /** @var Sheet[] $sheets */
    $sheets = $form->getSheets();
    foreach($sheets as $sheet){
        $fields = $sheet->getFields();
        /** @var FieldInterface $field */
        foreach($fields as $field){
            $indexable .= $flexFormValues[$field->getName()]."\r\n";        
        }
    }
    return $output;
}

public function getContentFromContentElement($ttContentRow) {
    /** @var \TYPO3\CMS\Extbase\Object\ObjectManager $objectManager */
    $objectManager = GeneralUtility::makeInstance(ObjectManager::class);
    /** @var ProviderResolver $fluxProviderResolver */
    $fluxProviderResolver = $objectManager->get(ProviderResolver::class);
    $fluxProvider = $fluxProviderResolver->resolvePrimaryConfigurationProvider('tt_content', 'pi_flexform', $ttContentRow);

    if(!is_object($fluxProvider)) return '';
    $form = $fluxProvider->getForm($ttContentRow);
    if(!is_object($form)) return '';
    // @TODO index pdf files linked with FAL ..
    $content = $this->getIndexableFromFluxForm($fluxProvider, $form, $ttContentRow);
    $content .= parent::getContentFromContentElement($ttContentRow);
    return $content;
}

/**
 * Constructor of this object
 */
public function __construct($pObj) {
     parent::__construct($pObj);
}`
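
To see what the nested `strip_tags` / `html_entity_decode` / `strip_tags` chain above does to the raw `pi_flexform` column, here is a minimal standalone sketch. The sample XML and the function name `flattenFlexform` are assumptions made up for illustration; real FlexForm data will look different and nothing here calls the Flux or ke_search API.

```php
<?php
// Hypothetical, simplified sample of what tt_content.pi_flexform might hold.
// Rich text is stored entity-encoded inside the XML value node.
$flexformData = <<<XML
<T3FlexForms>
    <data>
        <sheet index="options">
            <language index="lDEF">
                <field index="settings.maincontent">
                    <value index="vDEF">&lt;p&gt;Hello &lt;b&gt;world&lt;/b&gt;&lt;/p&gt;</value>
                </field>
            </language>
        </sheet>
    </data>
</T3FlexForms>
XML;

// flattenFlexform() is an illustrative name, not part of the extension.
function flattenFlexform(string $xml): string
{
    $text = strip_tags($xml);           // 1. drop the XML wrapper tags
    $text = html_entity_decode($text);  // 2. turn &lt;p&gt; back into <p>
    $text = strip_tags($text);          // 3. drop the decoded HTML tags
    // Collapse runs of two or more whitespace characters, then trim the ends.
    return trim(preg_replace('/\s\s+/', ' ', $text));
}

echo flattenFlexform($flexformData), PHP_EOL; // prints "Hello world"
```

Because every value node in the XML is kept, this approach indexes all fields of the element, whether or not they carry the `allowKeSearchIndex` variable.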

kledo-34 commented 4 years ago

@FB1op3nUP Thank you so much for investing your time in this. I really wanted to use ke_search with flux, but could you explain a bit more how you got it to work?

When I try to add an Indexer Configuration record in the search data folder, I still get the error 'An error occurred trying to process items for field "Type" (TYPO3 Fatal Error: Extension key "bdm_kesearchindexer_flux" is NOT loaded!).'

Before that, I replaced the content in ext/bdm_kesearchindexer_flux-master/classes/KeSearchIndexer/TypesFluidcontent.php as you mentioned (I only replaced the code from `private function getIndexableFromFluxForm` onwards). After that I installed the indexer and cleared all caches.

My TypesFluidcontent.php file is now the following:

TypesFluidcontent.zip

`<?php

namespace BDM\BdmKesearchindexerFlux\KeSearchIndexer;

use BDM\BdmKesearchindexerFlux\Helper\FluxHelper;
use FluidTYPO3\Flux\Form;
use FluidTYPO3\Flux\Form\Container\Sheet;
use FluidTYPO3\Flux\Form\FieldInterface;
use FluidTYPO3\Flux\Provider\ProviderInterface;
use FluidTYPO3\Flux\Provider\ProviderResolver;
use TYPO3\CMS\Core\Utility\ExtensionManagementUtility;
use TYPO3\CMS\Core\Utility\GeneralUtility;
use TYPO3\CMS\Extbase\Object\ObjectManager;

require_once ( ExtensionManagementUtility::extPath( 'ke_search' ).'Classes/indexer/types/class.tx_kesearch_indexer_types_tt_content.php');

class TypesFluidcontent extends \tx_kesearch_indexer_types_tt_content{

    private function getIndexableFromFluxForm(ProviderInterface $provider, Form $form, array $ttContentRow){
        $indexable = '';
        $flexFormValues = $provider->getFlexFormValues($ttContentRow);

        // flexform
        $flexformData = $ttContentRow['pi_flexform'];

        /* Process the data */
        $parsedData = strip_tags(           // 3. strip html?
            html_entity_decode(             // 2. decode actual content?
                strip_tags($flexformData)   // 1. strip xml?
            )
        );

        /* Collapse runs of two or more whitespace characters into a single space */
        $output = preg_replace('/\s\s+/', ' ', $parsedData);

        /** @var Sheet[] $sheets */
        $sheets = $form->getSheets();
        foreach($sheets as $sheet){
            $fields = $sheet->getFields();
            /** @var FieldInterface $field */
            foreach($fields as $field){
                $indexable .= $flexFormValues[$field->getName()]."\r\n";
            }
        }
        return $output;
    }

    public function getContentFromContentElement($ttContentRow) {
        /** @var \TYPO3\CMS\Extbase\Object\ObjectManager $objectManager */
        $objectManager = GeneralUtility::makeInstance(ObjectManager::class);
        /** @var ProviderResolver $fluxProviderResolver */
        $fluxProviderResolver = $objectManager->get(ProviderResolver::class);
        $fluxProvider = $fluxProviderResolver->resolvePrimaryConfigurationProvider('tt_content', 'pi_flexform', $ttContentRow);

        if(!is_object($fluxProvider)) return '';
        $form = $fluxProvider->getForm($ttContentRow);
        if(!is_object($form)) return '';
        // @TODO index pdf files linked with FAL ..
        $content = $this->getIndexableFromFluxForm($fluxProvider, $form, $ttContentRow);
        $content .= parent::getContentFromContentElement($ttContentRow);
        return $content;
    }

    /**
     * Constructor of this object
     */
    public function __construct($pObj) {
        parent::__construct($pObj);
    }
}`

Thanks for any further advice

dgyrman commented 4 years ago

Hi, kledo-34

The code above parses the XML stored in pi_flexform in the table "tt_content". I can't really explain why this works and the original code does not, but I noticed that I forgot to mention some changes.

`//require_once ( ExtensionManagementUtility::extPath( 'ke_search' ).'Classes/indexer/types/class.tx_kesearch_indexer_types_tt_content.php');

class TypesFluidcontent extends \TeaminmediasPluswerk\KeSearch\Indexer\Types\TtContent`

You can remove the require_once line and change the class definition. ke_search changed its class names in previous updates, so I think you get the error because TypesFluidcontent.php can't find its parent class. You should also enable some TCA fields in the file Hooks/TCA/Overrides.php. Download and reinstall the extension. I didn't develop this solution myself, so it may be better to do it this way. Here is the full extension.

Hope it will work.

bdm_kesearchindexer_flux.zip

kledo-34 commented 4 years ago

Yeah, thank you so much. I got it working now, and I had quite a bit of time today to understand the whole thing better and add some further functionality. Your solution currently adds all flux fields to the index without exception. This does work, but I also wanted to get back the original behaviour of choosing which fields to index. Otherwise people will get search results containing a lot of settings information that I also store in flux fields. Therefore I changed your code in TypesFluidcontent.php again to bring it closer to the original functionality. The only thing that doesn't work yet with my new configuration is indexing nested Section and Object Flux fields. So if anyone knows how to add that, feel free to do so ;)

@FB1op3nUP thanks again for your solution, without it I would not have got that far.

My current version (indexes all flux fields with the variable allowKeSearchIndex, except for fields inside flux sections and objects): bdm_kesearchindexer_flux.zip

kledo-34 commented 4 years ago

Hey guys, great news: I managed to add the feature I explained above, so fields nested in Flux sections and objects can now be indexed as well. I'm far from being a PHP developer, but since I understood the code a bit better yesterday, I was able to add this feature in TypesFluidcontent.php. It basically checks whether a field is a section and, if so, loops through all of its objects and fields and checks whether each field has the allowKeSearchIndex variable. If it does, the field's data is read out and added to the content to be indexed.
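
The section/object walk described above can be sketched roughly as follows. This is not the code from the attached zip: the form definition is modelled with plain arrays, and the keys `type`, `objects`, `fields` and the function name `collectIndexable` are hypothetical stand-ins for the Flux form objects.

```php
<?php
/**
 * Collect the values of all fields flagged with allowKeSearchIndex,
 * descending into sections and their objects.
 * The array layout is a hypothetical stand-in, not the Flux API.
 */
function collectIndexable(array $fields, array $values): array
{
    $texts = [];
    foreach ($fields as $field) {
        $name = $field['name'];
        if (($field['type'] ?? 'field') === 'section') {
            // A section stores a list of rows; each row is keyed by object name.
            foreach ($values[$name] ?? [] as $row) {
                foreach ($field['objects'] as $object) {
                    $objectValues = $row[$object['name']] ?? null;
                    if (is_array($objectValues)) {
                        $texts = array_merge(
                            $texts,
                            collectIndexable($object['fields'], $objectValues)
                        );
                    }
                }
            }
        } elseif (!empty($field['allowKeSearchIndex']) && isset($values[$name])) {
            $texts[] = (string)$values[$name];
        }
    }
    return $texts;
}

// Made-up form definition: one indexable field, one settings field, one section.
$fields = [
    ['name' => 'headline', 'allowKeSearchIndex' => true],
    ['name' => 'cssClass'],
    ['name' => 'items', 'type' => 'section', 'objects' => [
        ['name' => 'item', 'fields' => [
            ['name' => 'title', 'allowKeSearchIndex' => true],
            ['name' => 'icon'],
        ]],
    ]],
];
$values = [
    'headline' => 'Welcome',
    'cssClass' => 'teaser--dark',
    'items' => [
        ['item' => ['title' => 'First', 'icon' => 'star']],
        ['item' => ['title' => 'Second', 'icon' => 'moon']],
    ],
];

echo implode("\n", collectIndexable($fields, $values)), PHP_EOL;
```

Settings-style fields such as `cssClass` and `icon` stay out of the result because they do not carry the flag.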

As I read through the code, I noticed that the original version by BenjaminBeck didn't have this feature (at least I couldn't find anything in that direction), so hopefully it's a good new functionality ;)

The changed TypesFluidcontent.php is here: TypesFluidcontent.zip

The complete extension is here: bdm_kesearchindexer_flux.zip

@FB1op3nUP Thanks again for the start and the update of the indexer. If you are using the indexer yourself, I think my added features would be useful for you as well, since choosing which flux fields to index and indexing sections and objects are quite useful functionalities for common flux users :) Have fun with it.

@BenjaminBeck If you are still active, I think you could update the repository now, as I believe the indexer works with all current versions. (Tested with TYPO3 9.5.11, ke_search 3.0.6, flux 9.2.0)

dgyrman commented 4 years ago

This is really cool, thank you!

kledo-34 commented 4 years ago

Hey there, once again haha, I added a few more things,

First of all, I stripped tags and whitespace from the indexed content with some code FB1op3nUP already provided before (it was not necessary before, but it is necessary for the following implementation).

The original Flux indexer was based on the TtContent indexer, which results in each content element being a single entry in the search index. When people search, they can therefore find the same page multiple times, once for each matching content element. The page indexer of ke_search indexes the content of one page as a single entry, so people find each page only once when they search.

Inspired by that, I added a second indexer type, which I called 'Indexer for flux elements per page', directly under 'Indexer for flux elements' in the TYPO3 Indexer Configuration Type field. This indexer adds all of the flux content elements that you defined in the field 'content element types which should be indexed' to the search index, grouped by page. (So it's simply the page indexer for Flux.) In addition, the abstract and image functions that work with the normal page indexer work with the new indexer as well, so you can set your own abstract text or image/icon for the search result of each page.
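
The difference between the two indexer types comes down to grouping by page. A rough sketch with plain arrays, assuming each row carries a `pid` and already-extracted `text`; the function name `groupContentByPage` and the row layout are made up for illustration and are not taken from the extension:

```php
<?php
/**
 * Merge the text of all content elements that share a page id (pid),
 * so that each page yields exactly one index entry.
 * Hypothetical helper, not the extension's actual code.
 */
function groupContentByPage(array $rows): array
{
    $pages = [];
    foreach ($rows as $row) {
        $pages[$row['pid']][] = $row['text'];
    }
    // array_map with a single array keeps the pid keys intact.
    return array_map(
        static fn(array $texts): string => implode("\n", $texts),
        $pages
    );
}

// Three content elements on two pages.
$rows = [
    ['pid' => 12, 'text' => 'Accordion intro'],
    ['pid' => 12, 'text' => 'Accordion item'],
    ['pid' => 15, 'text' => 'Teaser text'],
];

// Per-element indexing would create three entries; per-page creates two.
foreach (groupContentByPage($rows) as $pid => $text) {
    echo $pid, ': ', str_replace("\n", ' | ', $text), PHP_EOL;
}
```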

@FB1op3nUP I think this could also be a handy feature for you if you use the indexer yourself. Sorry for splitting things up a bit across all those comments before. If I don't encounter further errors, I think I now have all the features that I need ;)

Below the new Version and in short everything that is new compared to the current version 1.0.0.

Have fun with it ;) bdm_kesearchindexer_flux.zip

kledo-34 commented 4 years ago

@Chris-dev and @BenjaminBeck I think this issue can be closed now.

kledo-34 commented 4 years ago

Hey there, once again one last update ;)

I guess really last update now :)

New version: bdm_kesearchindexer_flux.zip

X-Tender commented 4 years ago

First of all, thank you for the work, @kledo-34. I noticed a small issue. When you create the indexer, all flux CEs are added to the content types text field, along with the TYPO3 default ones, but there is a comma missing after the "uploads" CE string. Here's the content I got:

text,textmedia,textpic,bullets,table,html,header,uploadsboilerplate_accordion,boilerplate_accordionitem

The comma is missing right after uploads and before boilerplate_accordion. I don't know whether placing it is the flux indexer's job; I just wanted to mention it.
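
A missing separator like this typically appears when two comma-separated lists are glued together with plain string concatenation. Where the extension actually builds this default value is not shown in this thread, so the following is only a sketch of a join that cannot lose the comma; `mergeContentTypes` is a hypothetical helper name.

```php
<?php
/**
 * Join a comma-separated default list with additional content types.
 * Hypothetical helper; splitting and re-joining guarantees one comma
 * between every pair of entries.
 */
function mergeContentTypes(string $defaults, array $extra): string
{
    $all = array_merge(explode(',', $defaults), $extra);
    $all = array_map('trim', $all);
    // Drop empty entries (e.g. from a trailing comma) and duplicates.
    $all = array_filter($all, static fn(string $type): bool => $type !== '');
    return implode(',', array_unique($all));
}

$defaultTypes = 'text,textmedia,textpic,bullets,table,html,header,uploads';
$fluxTypes = ['boilerplate_accordion', 'boilerplate_accordionitem'];

// Plain concatenation reproduces the reported string "…uploadsboilerplate_accordion,…":
$broken = $defaultTypes . implode(',', $fluxTypes);

echo mergeContentTypes($defaultTypes, $fluxTypes), PHP_EOL;
```

Until the default is fixed, adding the comma by hand in the indexer record's content types field should give the same result.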

X-Tender commented 4 years ago

Could it be that `<flux:form.section>` elements aren't indexed? I've added the allowKeSearchIndex variable to the fields inside the `<flux:form.object>` element, but it didn't work.

X-Tender commented 4 years ago

OK TIL: Don't add a pages indexer when you already use the fluidconentpages index

tpinne commented 4 years ago

https://extensions.typo3.org/extension/flux_kesearch_indexer tested with success in TYPO3 9.5.19 and flux 9.2.0.

Mini-Documentation is in the Readme at GitHub https://github.com/MamounAlsmaiel/flux_kesearch_indexer

It uses the default page indexer of ke_search together with the additionalFields hook. Therefore you need to list the flux content elements you wish to index in the page indexer record, in addition to setting up the TypoScript for the extension. Once you figure that out, it works perfectly.

kledo-34 commented 3 years ago

@X-Tender Sorry that I didn't answer earlier, I was not on GitHub for a long time. As @tpinne mentioned, you are better off using the new TER extension, as it is maintained at the moment and is also available for TYPO3 10. BUT be aware, it also has the comma issue after uploads that you mentioned in my version above haha ;) Kind of the same things everywhere ;)