text_embedder.py: Modified the clean_text function to truncate text more efficiently when it exceeds the token limit. Instead of removing one character per iteration, the function now removes a variable number of characters: the step starts at one, doubles every five iterations, and is capped at 100 characters per iteration. This should reduce the number of iterations needed to bring long texts under the limit.
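The truncation schedule described above can be sketched roughly as follows. This is an illustration only, not the actual clean_text code: the function name truncate_to_token_limit, its parameters, and the whitespace-based token counter are all hypothetical, and it assumes characters are removed from the end of the text, which the description does not state.

```python
from typing import Callable


def whitespace_token_count(text: str) -> int:
    """Hypothetical stand-in for a real tokenizer: counts whitespace-separated words."""
    return len(text.split())


def truncate_to_token_limit(
    text: str,
    count_tokens: Callable[[str], int],
    max_tokens: int,
    max_step: int = 100,      # cap on characters removed per iteration
    double_every: int = 5,    # double the step after this many iterations
) -> str:
    """Trim characters from the end of `text` until it fits within `max_tokens`.

    Assumption: trimming happens at the end of the string. The step starts at
    one character, doubles every `double_every` iterations, and never exceeds
    `max_step`.
    """
    step = 1
    iterations = 0
    while text and count_tokens(text) > max_tokens:
        text = text[:-step]
        iterations += 1
        if iterations % double_every == 0:
            step = min(step * 2, max_step)
    return text


if __name__ == "__main__":
    long_text = "word " * 1000
    shortened = truncate_to_token_limit(long_text, whitespace_token_count, max_tokens=10)
    print(whitespace_token_count(shortened), len(shortened))
```

One trade-off of a growing step is that the final iteration may remove more characters than strictly necessary (up to one step's worth), so the result can end slightly below the limit rather than exactly at it.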