run-llama / LlamaIndexTS

Data framework for your LLM applications. Focus on server side solution
https://ts.llamaindex.ai
MIT License

SentenceSplitter doesn't take care of excludedEmbedMetadataKeys #1098

Open pserrer1 opened 3 months ago

pserrer1 commented 3 months ago

The current SentenceSplitter has some issues with excludedEmbedMetadataKeys. The TextNodes created by the split contain the whole content, including all metadata entries, as text.

I'm not fully sure how this is supposed to work, but my naive assumption is that MetadataAwareTextSplitter should use node.getContent(MetadataMode.NONE) instead of node.getContent(MetadataMode.ALL).
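To make the report concrete, here is a minimal, self-contained sketch of the behavior described above. It does not import LlamaIndexTS: `SketchNode`, `splitIntoChunks`, and the sample metadata are hypothetical stand-ins written for illustration, and the way metadata is rendered into text is an assumption, not the library's actual implementation. It only shows why splitting the output of `getContent(MetadataMode.ALL)` carries metadata into the chunks, while `getContent(MetadataMode.NONE)` does not.

```typescript
// Simplified stand-ins for illustration only. These are NOT the LlamaIndexTS
// classes; names and metadata formatting are assumptions.
const MetadataMode = { ALL: "ALL", EMBED: "EMBED", LLM: "LLM", NONE: "NONE" } as const;
type MetadataMode = (typeof MetadataMode)[keyof typeof MetadataMode];

class SketchNode {
  text: string;
  metadata: Record<string, string>;
  excludedEmbedMetadataKeys: string[];
  excludedLlmMetadataKeys: string[];

  constructor(
    text: string,
    metadata: Record<string, string> = {},
    excludedEmbedMetadataKeys: string[] = [],
    excludedLlmMetadataKeys: string[] = [],
  ) {
    this.text = text;
    this.metadata = metadata;
    this.excludedEmbedMetadataKeys = excludedEmbedMetadataKeys;
    this.excludedLlmMetadataKeys = excludedLlmMetadataKeys;
  }

  // Render metadata as "key: value" lines, honoring the exclusion list
  // that belongs to the requested mode. NONE renders no metadata at all.
  getMetadataStr(mode: MetadataMode): string {
    if (mode === MetadataMode.NONE) return "";
    const excluded =
      mode === MetadataMode.EMBED
        ? this.excludedEmbedMetadataKeys
        : mode === MetadataMode.LLM
          ? this.excludedLlmMetadataKeys
          : []; // ALL ignores both exclusion lists
    return Object.entries(this.metadata)
      .filter(([key]) => !excluded.includes(key))
      .map(([key, value]) => `${key}: ${value}`)
      .join("\n");
  }

  getContent(mode: MetadataMode): string {
    const metadataStr = this.getMetadataStr(mode);
    return metadataStr ? `${metadataStr}\n\n${this.text}` : this.text;
  }
}

// Hypothetical splitter: one chunk per sentence of whatever getContent returns.
function splitIntoChunks(node: SketchNode, mode: MetadataMode): string[] {
  return node
    .getContent(mode)
    .split(/(?<=[.!?])\s+/)
    .filter((chunk) => chunk.length > 0);
}

const node = new SketchNode(
  "First sentence. Second sentence.",
  { secret: "internal-id-42", title: "Demo" },
  ["secret"], // excludedEmbedMetadataKeys
);

const chunksAll = splitIntoChunks(node, MetadataMode.ALL);
const chunksNone = splitIntoChunks(node, MetadataMode.NONE);

// Does the excluded metadata value end up inside any chunk's text?
const leaksAll = chunksAll.some((chunk) => chunk.includes("internal-id-42"));
const leaksNone = chunksNone.some((chunk) => chunk.includes("internal-id-42"));

console.log({ leaksAll, leaksNone });
```

In this sketch the `ALL` split produces a chunk whose text contains the excluded `secret` value, whereas the `NONE` split yields only the two sentences. A regression test for the real splitter could assert the same property against actual TextNodes, assuming the library's behavior matches the report.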

himself65 commented 3 months ago

Could you please add a test for the case? Thanks