elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

Synthetic Source #86603

Open nik9000 opened 2 years ago

nik9000 commented 2 years ago

This shrinks the index by implementing a "synthetic" _source field. Instead of saving the field to disk we reconstruct it on the fly using our column store, doc values.
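To illustrate the idea of rebuilding `_source` from a column store, here is a toy Python model. It is only a sketch under stated assumptions, not how Elasticsearch implements it: `ToyColumnStore` and its methods are hypothetical names, and it assumes keyword-like columns whose values are stored sorted and de-duplicated per document.

```python
from collections import defaultdict


class ToyColumnStore:
    """Hypothetical stand-in for doc values: one column per field."""

    def __init__(self):
        # field name -> {doc_id: sorted, de-duplicated list of values}
        self.columns = defaultdict(dict)

    def index(self, doc_id, doc):
        # Store only the columns; the original JSON document is not kept.
        for field, value in doc.items():
            values = value if isinstance(value, list) else [value]
            self.columns[field][doc_id] = sorted(set(values))

    def synthesize_source(self, doc_id):
        # Rebuild a document on the fly by reading each column for doc_id.
        source = {}
        for field in sorted(self.columns):
            values = self.columns[field].get(doc_id)
            if values is None:
                continue
            source[field] = values[0] if len(values) == 1 else values
        return source


store = ToyColumnStore()
store.index(1, {"tags": ["b", "a", "b"], "host": "web-1"})
rebuilt = store.synthesize_source(1)
print(rebuilt)
```

In this toy model the rebuilt document carries the same field values as the original, but the original field order, array order, and duplicate values are lost, because only the columns were stored.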

Before removing the feature flag

Later

Much later


elasticmachine commented 2 years ago

Pinging @elastic/es-search (Team:Search)

elasticmachine commented 2 years ago

Pinging @elastic/es-analytics-geo (Team:Analytics)

jsoriano commented 2 years ago

@nik9000 does synthetic source leverage _source_include/_source_exclude for the fields it has to synthesize?

nik9000 commented 2 years ago

> @nik9000 does synthetic source leverage _source_include/_source_exclude for the fields it has to synthesize?

It does not. There is no support at the moment for any kind of partial synthesis.

rocco8620 commented 2 years ago

Awesome feature, can't wait to have this in GA!!

Kiriakos1998 commented 1 year ago

Hello @nik9000 , can I pick some of the unchecked subtasks?

nik9000 commented 1 year ago

> Hello @nik9000 , can I pick some of the unchecked subtasks?

I think all of the unchecked tasks are quite difficult to be honest. ignore_malformed is maybe easier, but I wouldn't suggest picking it up.

Also you'd need a committer buddy and I've had to move on to other tasks sadly. That might be quite difficult to find too.

iby-dev commented 7 months ago

> > @nik9000 does synthetic source leverage _source_include/_source_exclude for the fields it has to synthesize?
>
> It does not. There is no support at the moment for any kind of partial synthesis.

Hi @nik9000 - just for my own clarity: you can either use mode: synthetic on its own or use _source_include/_source_exclude, but the two cannot be combined. Is this correct?

nik9000 commented 7 months ago

> Hi @nik9000 - just for my own clarity: you can either use mode: synthetic on its own or use _source_include/_source_exclude, but the two cannot be combined. Is this correct?

Right. I honestly didn't know how to combine them so I just declared combining them to be incompatible.

Keep in mind synthetic source is only GA for time series indices and data streams. I've had to move on to other things but expect folks will get back to working on getting synthetic source good in more contexts at some point soon.
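The "either one or the other" rule discussed above can be sketched as a small client-side check in Python. This is an illustration only: `check_source_mapping` is a hypothetical helper, not part of any Elasticsearch client, and it assumes the `_source` mapping is written as a dict with a `mode` key and optional `includes`/`excludes` lists, which may not match the parameter names used in the question.

```python
def check_source_mapping(source_mapping):
    """Hypothetical check: reject synthetic mode combined with source filtering."""
    synthetic = source_mapping.get("mode") == "synthetic"
    filtered = any(source_mapping.get(key) for key in ("includes", "excludes"))
    if synthetic and filtered:
        raise ValueError("synthetic _source cannot be combined with includes/excludes")
    return source_mapping


# Either option on its own passes the check.
synthetic_only = check_source_mapping({"mode": "synthetic"})
filtered_only = check_source_mapping({"excludes": ["debug.*"]})

# Combining the two is rejected.
try:
    check_source_mapping({"mode": "synthetic", "excludes": ["debug.*"]})
    combined_rejected = False
except ValueError:
    combined_rejected = True

print(combined_rejected)
```

The check only mirrors the rule as stated in this thread; the server remains the authority on which combinations it accepts.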

elasticsearchmachine commented 5 months ago

Pinging @elastic/es-storage-engine (Team:StorageEngine)