Namit2111 / bible-verse-finder

https://bible-verse-finder.vercel.app
GNU General Public License v3.0

Entire bible parsed for clustering #41

Closed rabelmervin closed 1 month ago

rabelmervin commented 1 month ago

Description

This PR updates the project to use the entire Bible for clustering, improving the accuracy and reducing the number of 0% similarity results. Additionally, the data storage has been reorganized, so each book stores its own chapters, which store their own scriptures. The printing of clusters now includes the book name with the verse number (e.g., Matthew 5:38), replacing the manual concatenation of 1 John.

Related Issues

Fixes #123 (Bible clustering issue)

Related to #124 (Data storage improvement)

Changes List

✅ Updated the clustering to use the entire Bible instead of 1 John.

✅ Reorganized data storage for the Gutenberg Bible, structuring by books, chapters, and scriptures.

✅ Modified cluster printing to include the book name attached to the verse number.
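The storage layout and reference format described above could look roughly like the following. This is a minimal sketch, not the PR's actual code: `build_index`, `format_reference`, and the sample rows are hypothetical names and placeholder data introduced only for illustration.

```python
from collections import defaultdict


def build_index(rows):
    """Group (book, chapter, verse, text) rows into book -> chapter -> verse -> text."""
    index = defaultdict(lambda: defaultdict(dict))
    for book, chapter, verse, text in rows:
        index[book][chapter][verse] = text
    return index


def format_reference(book, chapter, verse):
    """Return a reference such as 'Matthew 5:38' (book name, then chapter:verse)."""
    return f"{book} {chapter}:{verse}"


# Placeholder rows (hypothetical); real rows would come from the parsed Bible text.
sample_rows = [
    ("Matthew", 5, 38, "placeholder text A"),
    ("Matthew", 5, 39, "placeholder text B"),
    ("1 John", 1, 1, "placeholder text C"),
]

index = build_index(sample_rows)
for book, chapters in index.items():
    for chapter, verses in chapters.items():
        for verse, text in verses.items():
            print(format_reference(book, chapter, verse), "-", text)
```

Keeping the book name in the index key means the printed reference never has to be concatenated by hand for a specific book.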

Type of Changes

✅ Bug fix (fixes an existing issue)

✅ Enhancement (improves or changes existing functionality)

Checklist

✅ My code follows the style guidelines of this project.

✅ I have performed a self-review of my code.

✅ I have commented my code, particularly in hard-to-understand areas.

✅ I have made corresponding changes to the issue.

✅ New and existing unit tests pass locally with my changes.

vercel[bot] commented 1 month ago

@rabelmervin is attempting to deploy a commit to the namit2111's projects Team on Vercel.

A member of the Team first needs to authorize it.

JustinhSE commented 1 month ago

@rabelmervin can you show that on your local host this outputs the correct response?

rabelmervin commented 1 month ago

Sure @JustinhSE Screenshot (54)

JustinhSE commented 1 month ago

@rabelmervin no I meant can you send a screenshot of what your changes look like on your local host

JustinhSE commented 1 month ago

Like the output similar to what a user would see

rabelmervin commented 1 month ago

@JustinhSE I'm not able to run it. Could you please tell me how to run it without errors?

Screenshot (55) Screenshot (56)

JustinhSE commented 1 month ago

@rabelmervin review the comments here first and see if that resolves it... but the debugging is for you to do, as you are the one making the changes.

JustinhSE commented 4 weeks ago

@rabelmervin the deadline for this issue to be eligible for a badge is coming up in the next few days. Feel free to work on this and open another PR

rabelmervin commented 4 weeks ago

Sure, I am happy to work on this issue, sir @JustinhSE. Could you please guide me more?

JustinhSE commented 4 weeks ago

Unfortunately not. Although I lead along with Namit, these issues are supposed to be completed by the user. You should be asking your teammate for help as well.

rabelmervin commented 3 weeks ago

Hi @JustinhSE @Namit2111, I ran it on localhost. What do you think about this? Screenshot (1)

JustinhSE commented 3 weeks ago

So the only thing is, that doesn’t print the book name before the chapter and verse. That’s the only change needed @rabelmervin

rabelmervin commented 3 weeks ago

Hi @JustinhSE, @Namit2111, I think it's all right now! Your thoughts? Screenshot (2)

JustinhSE commented 3 weeks ago

Yes and no. Yes, but why do some not show the #:#, @rabelmervin?

rabelmervin commented 3 weeks ago

Hi @JustinhSE, the problem occurs because verses are separated by \n, but some verses also contain a \n within the verse text itself.

Screenshot (3)
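One way to handle the wrapped lines described above is to treat any line that does not start with a chapter:verse marker as a continuation of the previous verse. The sketch below assumes each verse begins with a prefix such as `5:38 ` in the source text; that prefix format, the `VERSE_START` pattern, and the function name `join_wrapped_verses` are assumptions for illustration, not taken from the PR.

```python
import re

# Assumed verse prefix: "<chapter>:<verse> " at the start of a line.
VERSE_START = re.compile(r"^(\d+):(\d+)\s+(.*)$")


def join_wrapped_verses(lines):
    """Merge continuation lines into the verse they belong to.

    Returns a list of (chapter, verse, text) tuples.
    """
    verses = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # blank lines carry no verse text
        match = VERSE_START.match(line)
        if match:
            chapter, verse, text = match.groups()
            verses.append([int(chapter), int(verse), text])
        elif verses:
            # No marker: this line continues the previous verse.
            verses[-1][2] += " " + line
    return [tuple(v) for v in verses]


sample = ["1:1 first part of a verse", "that wraps onto a second line", "", "1:2 another verse"]
for chapter, verse, text in join_wrapped_verses(sample):
    print(f"{chapter}:{verse} {text}")
```

A wrapped line that happens to start with digits, a colon, and more digits would be misread as a new verse, so this heuristic would need checking against the real text.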

rabelmervin commented 2 weeks ago

Hi @JustinhSE, what do you think about the problem? Can I make a PR?

JustinhSE commented 2 weeks ago

Sorry, I missed your message @rabelmervin. Yes, but I will then be looking for alternative ways to parse the Bible.

rabelmervin commented 2 weeks ago

Hi @JustinhSE, is there any effective way to implement it?

JustinhSE commented 2 weeks ago

@rabelmervin right now, I don’t think so…. The only way forward I can think of is fetching the Bible or downloading a csv file and storing that… tbd on this

JustinhSE commented 2 weeks ago

@rabelmervin so I just uploaded a version of the complete Bible broken down by book, chapter, and verse. Check backend/utils/bible.json and see if you could use that instead. It is cleaner and more concise, and retrieving verses should be easier.
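As a rough illustration of why a structured file makes retrieval easier, the sketch below builds a lookup keyed by (book, chapter, verse). It assumes the JSON has a top-level `verses` list whose records carry `book_name`, `chapter`, `verse`, and `text`, matching the field names used elsewhere in this thread; the actual layout of backend/utils/bible.json is not confirmed here, and `build_lookup` and the sample records are hypothetical.

```python
def build_lookup(bible_data):
    """Map (book_name, chapter, verse) -> text for constant-time retrieval."""
    return {
        (v["book_name"], v["chapter"], v["verse"]): v["text"]
        for v in bible_data["verses"]
    }


# Placeholder data in the assumed shape; a real run would json.load() the file instead.
sample_data = {
    "verses": [
        {"book_name": "Matthew", "chapter": 5, "verse": 38, "text": "placeholder text A"},
        {"book_name": "1 John", "chapter": 1, "verse": 1, "text": "placeholder text B"},
    ]
}

lookup = build_lookup(sample_data)
print(lookup[("Matthew", 5, 38)])
print(lookup.get(("Matthew", 99, 1), "not found"))
```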

JustinhSE commented 2 weeks ago

@rabelmervin try an alteration of this code so that it factors well into our code

from sentence_transformers import SentenceTransformer, util
import json
import heapq

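# Load the pre-trained sentence-embedding model (all-MiniLM-L6-v2)
model = SentenceTransformer('all-MiniLM-L6-v2')

# Load the Bible JSON data (explicit encoding avoids platform-dependent defaults)
with open('bible.json', 'r', encoding='utf-8') as file:
    bible_data = json.load(file)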

# Extract all verses and their information
verses = []
verse_info = []
for verse in bible_data['verses']:
    verses.append(verse['text'])
    verse_info.append({
        'book_name': verse['book_name'],
        'chapter': verse['chapter'],
        'verse': verse['verse'],
        'text': verse['text']
    })

# Encode all verses (this step might take some time)
verse_embeddings = model.encode(verses, convert_to_tensor=True)

def find_similar_verses(theme, top_k=20):
    # Encode the input theme
    theme_embedding = model.encode(theme, convert_to_tensor=True)

    # Calculate cosine similarities
    similarities = util.pytorch_cos_sim(theme_embedding, verse_embeddings)[0]

    # Get top-k similar verses
    # Convert the tensor to plain floats first so heapq compares Python numbers
    top_results = heapq.nlargest(top_k, enumerate(similarities.tolist()), key=lambda x: x[1])

    results = []
    for idx, score in top_results:
        verse_data = verse_info[idx]
        result = {
            'reference': f"{verse_data['book_name']}, {verse_data['chapter']}:{verse_data['verse']}",
            'text': verse_data['text'],
            'score': float(score),
            'book_name': verse_data['book_name'],
            'chapter': verse_data['chapter'],
            'verse': verse_data['verse']
        }
        results.append(result)

    return results

JustinhSE commented 2 weeks ago

You must run this first, though: pip install sentence-transformers
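The ranking step in the snippet above (cosine similarity followed by a top-k selection) can be illustrated without downloading the model. The sketch below uses small hand-written vectors in place of real embeddings; `cosine_similarity` and `top_k_similar` are hypothetical helper names, and the vectors are toy values, not model output.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar(query, candidates, top_k=2):
    """Return the top_k (index, score) pairs, highest score first."""
    scored = ((idx, cosine_similarity(query, vec)) for idx, vec in enumerate(candidates))
    return heapq.nlargest(top_k, scored, key=lambda pair: pair[1])


# Toy vectors standing in for a theme embedding and verse embeddings.
theme_vector = [1.0, 0.0]
verse_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

for idx, score in top_k_similar(theme_vector, verse_vectors):
    print(idx, round(score, 3))
```

The indices returned here play the same role as `idx` in the snippet: they point back into the list of verse records so the book name, chapter, and verse can be attached to each result.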

rabelmervin commented 2 weeks ago

Thanks @JustinhSE excited to work on this!

JustinhSE commented 1 week ago

@rabelmervin any updates??

rabelmervin commented 1 week ago

yeah @JustinhSE i started working on it

JustinhSE commented 6 days ago

please let me know when you are done as #60 needs this issue merged

rabelmervin commented 6 days ago

Sure @JustinhSE

rabelmervin commented 4 days ago

Good morning @JustinhSE, I think it's working pretty well now! Screenshot (8) Screenshot (7)

JustinhSE commented 4 days ago

@rabelmervin thanks yea open a PR but we def need to do #56

JustinhSE commented 4 days ago

Also run the front end and make sure it works there too

rabelmervin commented 3 days ago

Hi @JustinhSE, the backend works well, but when I entered the theme in the frontend it showed "failed to fetch data". What do you think? Screenshot (9)

JustinhSE commented 3 days ago

Yea that means it’s not completely working

JustinhSE commented 3 days ago

I’ll take a look today if I have time