Open Canada-wet opened 1 year ago
What is the purpose of this statement: metadata_input = [metadatas[i]]*len(texts_temp)?
When we split the texts, we also need to duplicate the corresponding metadata, once per resulting chunk, so that every chunk still has a matching metadata entry when the FAISS vector DB is created.
e.g.
initial_text = 'I love watching YouTube videos. I am also a YouTuber myself.'
initial_metadata = [{'source': 'random_blog1', 'page': 6}]
split_text = ['I love watching YouTube videos.', 'I am also a YouTuber myself.']
split_metadata = [{'source': 'random_blog1', 'page': 6}, {'source': 'random_blog1', 'page': 6}]
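To make the example above concrete, here is a minimal sketch of the duplication step. The function name split_with_metadata and the regex-based sentence splitter are my own assumptions for illustration; the original code presumably uses a proper text splitter instead. Only the metadata_input line is taken from the question.

```python
import re


def split_with_metadata(texts, metadatas):
    """Split each text into chunks and repeat its metadata once per chunk."""
    split_texts, split_metadatas = [], []
    for i, text in enumerate(texts):
        # Hypothetical splitter (assumption): break after ., ! or ? followed by whitespace.
        texts_temp = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
        # The statement from the question: one metadata entry per chunk,
        # so both lists stay aligned index by index.
        metadata_input = [metadatas[i]] * len(texts_temp)
        split_texts.extend(texts_temp)
        split_metadatas.extend(metadata_input)
    return split_texts, split_metadatas


initial_text = ['I love watching YouTube videos. I am also a YouTuber myself.']
initial_metadata = [{'source': 'random_blog1', 'page': 6}]
split_text, split_metadata = split_with_metadata(initial_text, initial_metadata)
print(split_text)
print(split_metadata)
```

One thing to keep in mind: [metadatas[i]] * n repeats a reference to the same dict rather than copying it. That is fine if the metadata is only read afterwards, but if you plan to modify it per chunk (for example to add a chunk index), something like [dict(metadatas[i]) for _ in texts_temp] would be safer.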
Hi, thanks for the brilliant video. I just saw a comment asking for metadata, such as page numbers, in the similarity search results. I played around with it and made a few changes, and this works for me. Please check whether this can help.