elastic / crawler


ML inference to ingested document is not working #56

Open mikecalizo opened 4 months ago

mikecalizo commented 4 months ago

Bug Description

To Reproduce

Steps to reproduce the behavior: enable the following pipeline params in the crawler config:

    pipeline_enabled: true
    pipeline_params:
      _reduce_whitespace: true
      _run_ml_inference: true
      _extract_binary_content: true
    bulk_api:
      max_items: 10
      max_size_bytes: 1_048_576

Expected behavior

ML inference enabled in the ingested document.

Screenshots

Environment

Open Crawler running locally; Elasticsearch in ESS

seanstory commented 4 months ago

Hi @mikecalizo ! Thanks for filing. I'll leave this open because we could improve the experience here.

But I'm 99% sure this isn't working for you because _run_ml_inference: true isn't magic; it just passes a parameter to the pipeline you're running. If your pipeline doesn't have an inference processor in it, or doesn't use the _run_ml_inference param, this won't do much for you. The default pipeline is ent-search-generic-ingestion, which doesn't have an inference processor in it. If you change the pipeline: config in that YAML to specify a pipeline that uses an inference processor, you should be all set.
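For illustration, here is a rough sketch of what that change might look like, reusing the keys from the report. The pipeline name my-inference-pipeline is hypothetical: substitute an ingest pipeline that already exists in your cluster and contains an inference processor. The placement of pipeline: next to pipeline_enabled: is an assumption based on the snippet above, so check it against your own config file.

```yaml
# Hypothetical pipeline name; replace with an existing ingest pipeline
# that includes an inference processor.
pipeline: my-inference-pipeline
pipeline_enabled: true
pipeline_params:
  _reduce_whitespace: true
  # Only has an effect if the chosen pipeline actually reads this param.
  _run_ml_inference: true
  _extract_binary_content: true
```

With the default ent-search-generic-ingestion pipeline left in place, the same params are still sent, but nothing in that pipeline runs inference on them.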