Sanjay8602 / LLM-Resume-Analyser-using-Unify


Transform into AI-based function #5

Open OscarArroyoVega opened 1 month ago

OscarArroyoVega commented 1 month ago
Good.

Now we need to link it to a button. Later we can transform this traditional function into an AI function by asking the LLM to give us job title suggestions given the resume (and maybe some specific context, such as a website or document, to be more precise or up to date).

Originally posted by @OscarArroyoVega in https://github.com/Sanjay8602/LLM-Resume-Analyser-using-Unify/pull/2#pullrequestreview-2090188732

Sanjay8602 commented 1 month ago

How about this code:

import torch
from transformers import BertTokenizer, BertModel
from sklearn.metrics.pairwise import cosine_similarity

# Define job titles and associated descriptions (for better context matching)
job_title_descriptions = {
    "Data Scientist": "Analyze and interpret complex data to help companies make decisions. Work with machine learning, data analysis, python, and statistics.",
    "Software Engineer": "Develop, create, and modify general computer applications software or specialized utility programs. Work on software development, programming, coding, and software architecture.",
    # Add more job titles and associated descriptions as needed
}

# Load pre-trained BERT model and tokenizer
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertModel.from_pretrained('bert-base-uncased')


def encode_text(text):
    # Tokenize (truncated to BERT's 512-token limit) and mean-pool the last hidden state
    inputs = tokenizer(text, return_tensors='pt', truncation=True, padding=True, max_length=512)
    with torch.no_grad():
        outputs = model(**inputs)
    return outputs.last_hidden_state.mean(dim=1).squeeze()


def suggest_job_titles(resume_text):
    # Encode the resume text
    resume_embedding = encode_text(resume_text)

    matched_job_titles = []
    for job_title, description in job_title_descriptions.items():
        # Encode the job title description
        description_embedding = encode_text(description)

        # Calculate similarity
        similarity = cosine_similarity(resume_embedding.unsqueeze(0), description_embedding.unsqueeze(0)).item()

        if similarity > 0.8:  # Set a threshold for similarity
            matched_job_titles.append((job_title, similarity))

    # Sort job titles by similarity score
    matched_job_titles.sort(key=lambda x: x[1], reverse=True)
    return [job_title for job_title, similarity in matched_job_titles]

OscarArroyoVega commented 4 weeks ago

Hello @Sanjay8602

IMO this is a complicated way to do it. However, open the PR; you can change it afterwards. When the context is not long, we can pass it directly into the prompt. I believe an easy place to start is to create an LLMChain with a query asking for job titles based on the {resume} provided as context; check the examples in the repository. Then we can add an st.slider from 1 to 20 to let the user choose the number_of_job_titles that will be returned. The prompt will include {number_of_job_titles} to define the query.
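
A minimal sketch of that prompt-based approach, written in plain Python so it does not depend on any particular chain API. The prompt wording, the helper names (`build_prompt`, `parse_job_titles`, `suggest_job_titles_llm`) and the `call_llm` callable are all assumptions for illustration and are not taken from the repository. Only the `{resume}` and `{number_of_job_titles}` placeholders and the 1 to 20 range come from the comment above.

```python
import re
from typing import Callable, List

# Hypothetical prompt wording; only the two placeholder names come from the discussion.
PROMPT_TEMPLATE = (
    "Based on the resume below, suggest exactly {number_of_job_titles} "
    "suitable job titles, one per line, with no extra text.\n\n"
    "Resume:\n{resume}"
)

# Matches a leading bullet ("-", "*", "•") or list number ("1.", "2)") on a line.
_LIST_PREFIX = re.compile(r"^\s*(?:[-*•]|\d+[.)])\s*")


def build_prompt(resume: str, number_of_job_titles: int) -> str:
    """Fill the template; the range check mirrors the proposed 1-20 slider."""
    if not 1 <= number_of_job_titles <= 20:
        raise ValueError("number_of_job_titles must be between 1 and 20")
    return PROMPT_TEMPLATE.format(
        resume=resume.strip(), number_of_job_titles=number_of_job_titles
    )


def parse_job_titles(response: str, number_of_job_titles: int) -> List[str]:
    """Turn a one-title-per-line reply into a clean list of at most N titles."""
    titles = []
    for line in response.splitlines():
        cleaned = _LIST_PREFIX.sub("", line).strip()
        if cleaned:
            titles.append(cleaned)
    return titles[:number_of_job_titles]


def suggest_job_titles_llm(
    resume: str,
    number_of_job_titles: int,
    call_llm: Callable[[str], str],
) -> List[str]:
    """call_llm is a stand-in for whatever sends the prompt to the model."""
    prompt = build_prompt(resume, number_of_job_titles)
    return parse_job_titles(call_llm(prompt), number_of_job_titles)
```

In the Streamlit app, `number_of_job_titles` could come from something like `st.slider("Number of job titles", 1, 20)`, and `call_llm` could wrap the chain used elsewhere in the repository. Passing the model call in as a function keeps the prompt building and response parsing testable without an API key.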