SrijanShovit / HealthLearning

A repo comprising various Machine Learning and Deep Learning projects in the healthcare domain.
38 stars, 53 forks

Ayurveda GPT #73

Closed SrijanShovit closed 5 months ago

SrijanShovit commented 5 months ago

Problem Description: The problem is to develop an Ayurveda GPT (Generative Pre-trained Transformer) application trained on concepts provided by ancient Ayurvedic sages such as Sushruta, Charak, and others.

Solution Description:

  1. Gather Data: Collect a diverse dataset of texts, scriptures, and teachings from ancient Ayurvedic texts, including those authored by Sushruta, Charak, and other sages.
  2. Preprocess Data: Clean and preprocess the collected data to remove noise, standardize text formats, and prepare it for training.
  3. Train GPT Model: Utilize the preprocessed data to train a GPT model, ensuring that it captures the nuances and intricacies of Ayurvedic principles, treatments, and philosophies.
  4. Fine-tune Model: Fine-tune the trained GPT model on specific tasks or domains within Ayurveda, such as diagnosis, treatment recommendations, herbal remedies, etc.
  5. Develop Application: Build an intuitive and user-friendly application interface that allows users to interact with the trained GPT model.
  6. Deploy Application: Deploy the Ayurveda GPT application on a suitable platform, making it accessible to users.
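A minimal sketch of step 2 (Preprocess Data), assuming the texts have already been extracted to plain strings. The function names `clean_text` and `chunk_words`, the rule that a digits-only line is a page number, and the chunk sizes are all illustrative assumptions, not part of the project.

```python
import re
import unicodedata


def clean_text(raw: str) -> str:
    """Normalize Unicode, drop page-number lines, and collapse whitespace."""
    # NFC puts Devanagari and accented Latin characters in one canonical form.
    text = unicodedata.normalize("NFC", raw)
    kept = []
    for line in text.splitlines():
        stripped = line.strip()
        # Assumption: a line containing only digits is a page number.
        if stripped.isdigit():
            continue
        kept.append(stripped)
    joined = "\n".join(kept)
    # Rejoin words that were hyphenated across a line break.
    joined = re.sub(r"-\n(?=\w)", "", joined)
    # Collapse every run of whitespace (including newlines) to one space.
    return re.sub(r"\s+", " ", joined).strip()


def chunk_words(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into overlapping word windows for training or retrieval."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks
```

Overlapping windows are one common choice so that a sentence cut at a chunk boundary still appears whole in a neighbouring chunk; the right sizes would depend on the model used.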

Alternatives Considered:

  1. Utilizing existing Ayurveda datasets: Instead of collecting data from scratch, leverage existing datasets of Ayurvedic texts and teachings.
  2. Transfer learning: Instead of training a GPT model from scratch, use transfer learning techniques to adapt pre-trained language models to the Ayurveda domain.

Additional Context: The completion of this feature will be determined by the successful development and deployment of the Ayurveda GPT application, as well as its usability and effectiveness in providing accurate and valuable insights into Ayurvedic principles and practices.

sravaniyaramasu commented 5 months ago

@SrijanShovit Can I work on this issue, please? I will give it my best.

sanketv010 commented 5 months ago

Can you assign this issue to me?

SrijanShovit commented 5 months ago

Do you guys @sanketv010 @sravaniyaramasu want to collaborate? What would be the plan of action or steps? Let me know once you have had a chat.

sanketv010 commented 5 months ago

Works for me. I plan to build it by first collecting the scriptures as PDFs, then embedding the data and storing it in a vector database, and then using Retrieval-Augmented Generation (RAG) to answer user queries with the LLM. The technologies/frameworks I intend to use are Streamlit, LangChain, and Hugging Face.
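The retrieval part of that plan can be sketched with the standard library alone. This is only an illustration of the RAG flow, not the planned implementation: `embed` is a toy word-count stand-in for a real Hugging Face embedding model, `InMemoryVectorStore` is a hypothetical stand-in for a vector database, and the sample sentences are placeholders rather than text from the scriptures.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (English-oriented tokenizer)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryVectorStore:
    """Hypothetical stand-in for a vector database."""

    def __init__(self):
        self._items = []  # list of (chunk, vector) pairs

    def add(self, chunks):
        for chunk in chunks:
            self._items.append((chunk, embed(chunk)))

    def search(self, query, k=2):
        """Return the k stored chunks most similar to the query."""
        q = embed(query)
        ranked = sorted(
            self._items, key=lambda item: cosine(q, item[1]), reverse=True
        )
        return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question, context_chunks):
    """Assemble the text that would be sent to the LLM."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\nAnswer:"
    )


if __name__ == "__main__":
    store = InMemoryVectorStore()
    store.add([
        "Vata governs movement in the body.",       # placeholder text
        "Pitta governs digestion and metabolism.",  # placeholder text
    ])
    question = "What governs digestion?"
    print(build_prompt(question, store.search(question, k=1)))
```

In the real application the prompt would be passed to the LLM, and the word-count vectors would be replaced by dense embeddings, which also handle Hindi text and paraphrases far better than exact word overlap.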

SrijanShovit commented 5 months ago

Yeah, the plan is right; the only issue is that, from what I have come across, these scriptures are not available as PDFs with copyable text. Moreover, Hindi translations are easy to find. So if a PDF (Hindi or English, it doesn't matter) has copyable text, then it's good. Otherwise, we need to think along some other line.
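One way to check whether a PDF has copyable text is to extract each page and count the visible characters. The sketch below assumes the third-party `pypdf` package for extraction; the helper names and the thresholds (`min_chars`, `min_fraction`) are illustrative guesses that would need tuning on the actual files.

```python
def page_has_text(page_text: str, min_chars: int = 50) -> bool:
    """A page counts as copyable if it yields enough non-space characters."""
    visible = sum(1 for ch in page_text if not ch.isspace())
    return visible >= min_chars


def has_copyable_text(page_texts, min_chars=50, min_fraction=0.8) -> bool:
    """True if at least min_fraction of the pages carry extractable text."""
    if not page_texts:
        return False
    with_text = sum(page_has_text(t, min_chars) for t in page_texts)
    return with_text / len(page_texts) >= min_fraction


def extract_page_texts(pdf_path: str) -> list[str]:
    """Extract the text layer of every page.

    Assumes pypdf is installed (pip install pypdf); imported lazily so the
    checks above stay usable without it.
    """
    from pypdf import PdfReader

    reader = PdfReader(pdf_path)
    # Treat a page with no extractable text as an empty string.
    return [page.extract_text() or "" for page in reader.pages]
```

Counting characters rather than words keeps the check language-agnostic, so it behaves the same for Hindi and English. A file that fails it is probably a scan and would need a different route (OCR, for example), which this sketch does not cover.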

sanketv010 commented 5 months ago

Sure thing. I got a PDF of the Charaka Samhita; I'll begin working with it, and we can later add more PDFs or text files to the data folder once we find that the chatbot is working properly.
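For adding more files later, the app could simply pick up everything supported in the data folder at startup. A small sketch, assuming a folder of `.pdf` and `.txt` files; the name `list_sources` and the extension set are hypothetical.

```python
from pathlib import Path

# Assumption: only these two file types are ingested.
SUPPORTED_SUFFIXES = {".pdf", ".txt"}


def list_sources(data_dir):
    """Return every supported file under data_dir, sorted for stable order."""
    root = Path(data_dir)
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in SUPPORTED_SUFFIXES
    )
```

With this, dropping a new scripture into the folder needs no code change; only the index has to be rebuilt.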

SrijanShovit commented 5 months ago

Yeah, great! Depending on the workflow, do you want multiple issues or a single issue for this project? And is @sravaniyaramasu in the loop?

github-actions[bot] commented 5 months ago

This issue has been automatically closed because it has been inactive for more than 7 days. If you believe this is still relevant, feel free to reopen it or create a new one. Thank you!