MongoDB + Routes - Githubissues

dhaneshragu commented 1 year ago

Issue: Setting up MongoDB and Creating Content Model

Task Description

In order to enhance the functionality of the IntranetSearch project, we need to perform the following tasks:

Setup MongoDB:
- Configure MongoDB to store scraped content data.
- Store the MongoDB database URL and other necessary configurations in a .env file inside the backend folder.
Complete connectDB.js:
- Write the code for MongoDB connection configuration in a connectDB.js file inside the configs folder in the backend.
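As a rough illustration of the configuration step, the sketch below reads the database URL from the environment and fails early if it is missing. The helper name `getMongoConfig` and the variable names `MONGO_URI` and `DB_NAME` are assumptions made for this example; the issue does not prescribe them. The actual connection call in connectDB.js (through whichever driver or ODM the project uses) is left out.

```javascript
// Hypothetical helper: collects the settings connectDB.js would need.
// MONGO_URI and DB_NAME are assumed names, not ones specified in the issue.
function getMongoConfig(env = process.env) {
  const uri = env.MONGO_URI;
  if (typeof uri !== "string" || uri.trim() === "") {
    // Fail fast so a missing .env entry is caught at startup.
    throw new Error("MONGO_URI is not set; add it to the .env file in the backend folder");
  }
  return {
    uri: uri.trim(),
    // Optional database name; null means "use whatever the URI specifies".
    dbName: env.DB_NAME ? env.DB_NAME.trim() : null,
  };
}
```

Passing the environment as a parameter keeps the helper easy to test without touching the real `process.env`.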
Create Content Model:
- Create a MongoDB model for storing scraped content.
- This model should have the following fields:
  - content: To store the textual content of the scraped page.
  - url: To store the URL of the scraped page.
  - embeddings: To store a 384-dimensional vector, which will be used for various content analysis tasks. (For now, store dummy data in this field for testing purposes. The embeddings field can be made optional.)
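The field requirements above can be sketched in plain JavaScript, independent of any ODM. The names `EMBEDDING_DIM`, `makeDummyEmbedding`, and `validateContent` are hypothetical, and treating content and url as required is an assumption inferred from the note that only embeddings may be optional.

```javascript
const EMBEDDING_DIM = 384;

// Builds a placeholder vector for testing; the values are arbitrary, not real embeddings.
function makeDummyEmbedding(dim = EMBEDDING_DIM) {
  return Array.from({ length: dim }, (_, i) => (i % 10) / 10);
}

// Returns a list of problems with a candidate document; an empty list means it is valid.
function validateContent(doc) {
  const errors = [];
  if (typeof doc.content !== "string" || doc.content.length === 0) {
    errors.push("content must be a non-empty string");
  }
  if (typeof doc.url !== "string" || doc.url.length === 0) {
    errors.push("url must be a non-empty string");
  }
  // embeddings is optional, but when present it must be 384 finite numbers.
  if (doc.embeddings !== undefined) {
    const ok =
      Array.isArray(doc.embeddings) &&
      doc.embeddings.length === EMBEDDING_DIM &&
      doc.embeddings.every((x) => Number.isFinite(x));
    if (!ok) {
      errors.push(`embeddings must be an array of ${EMBEDDING_DIM} finite numbers`);
    }
  }
  return errors;
}
```

The same constraints would normally be expressed in the schema inside contentModel.js; this version only shows the shape of the checks.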
Name the Content Model File:
- Name the content model file as contentModel.js and place it inside the models folder (to be created) in the backend.
Create a CSV Export Controller:
- Create a controller named csvSaveController.js inside the controllers/web-crawler directory.
- This controller should be responsible for converting the documents stored in the content model to a CSV file.
- The CSV file should have a header row that matches the field names in the content model.
- The name of the CSV file should be provided in req.body.
- Also read the fields to include in the CSV file from req.body. Each field must be one of [content, url, embeddings] (e.g., if the user specifies only content in req.body, the CSV should contain only the content column).
- The generated CSV file has to be saved inside the data folder, which has already been created.
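The conversion at the heart of this controller can be sketched as a pure function that takes the documents and the requested fields and returns CSV text. The names `ALLOWED_FIELDS`, `escapeCsv`, and `documentsToCsv` are hypothetical, and serializing the embeddings array as a JSON string inside one cell is an assumption; the issue does not say how the vector should be written. Reading req.body and saving the result to the data folder are omitted here.

```javascript
const ALLOWED_FIELDS = ["content", "url", "embeddings"];

// Quotes a value when it contains a comma, double quote, or line break,
// doubling any embedded double quotes.
function escapeCsv(value) {
  // Assumption: arrays (such as embeddings) are written as a JSON string in one cell.
  const text = Array.isArray(value) ? JSON.stringify(value) : String(value ?? "");
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Converts plain document objects to CSV text with a header row.
// `fields` must be a non-empty subset of ALLOWED_FIELDS.
function documentsToCsv(docs, fields = ALLOWED_FIELDS) {
  const unknown = fields.filter((f) => !ALLOWED_FIELDS.includes(f));
  if (fields.length === 0 || unknown.length > 0) {
    throw new Error(`fields must be a non-empty subset of: ${ALLOWED_FIELDS.join(", ")}`);
  }
  const header = fields.join(",");
  const rows = docs.map((doc) => fields.map((f) => escapeCsv(doc[f])).join(","));
  return [header, ...rows].join("\n");
}
```

For example, calling `documentsToCsv(docs, ["content"])` yields a single content column, which matches the behaviour described for a request that asks only for content.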
Update Routes:
- Add appropriate routes and endpoints for the new functionalities in the routes folder.

Expected Outcome

Upon completion of these tasks, we will have MongoDB configured to store scraped content data, a content model for structured data storage, and the ability to convert content to CSV format for analysis. This will enhance our project's capabilities significantly.

Note: Please create a separate commit for each task, and include relevant documentation and comments in your code. The necessary folders and files have already been created. Also provide a Postman collection for testing.

Shifat-Ali commented 1 year ago

Hey! I would like to work on this issue.

dhaneshragu commented 1 year ago

@Shifat-Ali Assigned this to you. Make sure to read the task description carefully!

swciitg / IntranetSearch

MongoDB + Routes #6
