Issue: Setting up MongoDB and Creating Content Model
Task Description
In order to enhance the functionality of the IntranetSearch project, we need to perform the following tasks:
Setup MongoDB:
Configure MongoDB to store scraped content data.
Store the MongoDB database URL and other necessary configurations in a .env file inside the backend folder.
Complete connectDB.js:
Write the code for MongoDB connection configuration in a connectDB.js file inside the configs folder in the backend.
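A minimal sketch of what connectDB.js could look like. It assumes Mongoose as the MongoDB library, that backend/.env has already been loaded into process.env, and a hypothetical variable name MONGO_URI (the issue does not fix the name). The driver is accepted as a parameter only so the function can be exercised without a running database.

```javascript
// configs/connectDB.js (sketch; MONGO_URI is an assumed variable name)

/**
 * Connects to MongoDB and resolves with the driver's connection result.
 * `driver` defaults to Mongoose, which is only loaded when no driver is passed.
 */
async function connectDB(uri = process.env.MONGO_URI, driver = require("mongoose")) {
  if (!uri) {
    throw new Error("MONGO_URI is not set; add it to backend/.env");
  }
  try {
    const conn = await driver.connect(uri);
    console.log(`MongoDB connected: ${conn.connection.host}`);
    return conn;
  } catch (err) {
    // Log and rethrow so the caller can decide whether to exit.
    console.error(`MongoDB connection failed: ${err.message}`);
    throw err;
  }
}

module.exports = connectDB;
```

In the server entry point this would typically be called once at startup, after the .env file has been loaded (for example with the dotenv package).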
Create Content Model:
Create a MongoDB model for storing scraped content.
This model should have the following fields:
content: To store the textual content of the scraped page.
url: To store the URL of the scraped page.
embeddings: To store a vector of 384 dimensions, which will be used for various content analysis tasks.
(For now, store some dummy data in these fields for testing purposes. The embeddings field can be made optional.)
Name the Content Model File:
Name the content model file as contentModel.js and place it inside the models folder (to be created) in the backend.
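The model described above could be sketched as follows, assuming Mongoose. Marking content and url as required is an assumption drawn from the note that only embeddings may be optional. The model name "Content" and the helper names are hypothetical.

```javascript
// models/contentModel.js (sketch; model and helper names are assumptions)

const EMBEDDING_DIMENSIONS = 384;

// Accepts a missing vector (the field is optional) or exactly 384 finite numbers.
function isValidEmbedding(value) {
  if (value == null) return true;
  return (
    Array.isArray(value) &&
    value.length === EMBEDDING_DIMENSIONS &&
    value.every(Number.isFinite)
  );
}

// Plain schema definition, kept separate so it can be inspected without Mongoose.
const contentFields = {
  content: { type: String, required: true }, // text of the scraped page
  url: { type: String, required: true }, // URL of the scraped page
  embeddings: {
    type: [Number],
    required: false,
    default: undefined, // avoid Mongoose's implicit empty array
    validate: {
      validator: isValidEmbedding,
      message: `embeddings must contain exactly ${EMBEDDING_DIMENSIONS} numbers`,
    },
  },
};

// Builds (or reuses) the model; Mongoose is only loaded when not passed in.
function buildContentModel(mongoose = require("mongoose")) {
  const schema = new mongoose.Schema(contentFields);
  return mongoose.models.Content || mongoose.model("Content", schema);
}

module.exports = { EMBEDDING_DIMENSIONS, isValidEmbedding, contentFields, buildContentModel };
```

For the dummy data mentioned above, a vector such as `new Array(384).fill(0)` would pass this validator.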
Create a CSV Export Controller:
Create a controller named csvSaveController.js inside the controllers/web-crawler directory.
This controller should be responsible for converting the documents stored in the content model to a CSV file.
The CSV file should have a header row whose column names match the field names in the content model.
The name of the CSV file should be provided in req.body.
The fields to include in the CSV file should also be read from req.body. The allowed values are content, url, and embeddings.
(For example, if the request specifies only content, the CSV file should contain only the content column.)
The generated CSV file has to be saved inside the data folder, which has already been created.
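One way the controller could be structured is sketched below. The request keys fileName and fields, the default of exporting all three fields, the response shape, and the factory taking the model and the data folder path as arguments are all assumptions for illustration. The model is expected to offer a Mongoose-style `find(...).lean()`.

```javascript
// controllers/web-crawler/csvSaveController.js (sketch; request keys are assumptions)
const fs = require("node:fs/promises");
const path = require("node:path");

const ALLOWED_FIELDS = ["content", "url", "embeddings"];

// Quotes a value when it contains a comma, quote, or line break.
function escapeCsv(value) {
  const text = Array.isArray(value) ? JSON.stringify(value) : String(value ?? "");
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Converts documents to CSV text with one header row and only the requested columns.
function toCsv(docs, fields) {
  const header = fields.join(",");
  const rows = docs.map((doc) => fields.map((f) => escapeCsv(doc[f])).join(","));
  return [header, ...rows].join("\n") + "\n";
}

// Content: the content model; dataDir: absolute path of the existing data folder.
function makeCsvSaveController(Content, dataDir) {
  return async function csvSaveController(req, res) {
    const { fileName, fields = ALLOWED_FIELDS } = req.body ?? {};
    if (typeof fileName !== "string" || fileName.trim() === "") {
      return res.status(400).json({ error: "fileName is required" });
    }
    const fieldsOk =
      Array.isArray(fields) &&
      fields.length > 0 &&
      fields.every((f) => ALLOWED_FIELDS.includes(f));
    if (!fieldsOk) {
      return res.status(400).json({ error: `fields must be a subset of ${ALLOWED_FIELDS}` });
    }
    try {
      const docs = await Content.find({}, fields.join(" ")).lean();
      // basename() drops directory parts so the file stays inside dataDir.
      const safeName = path.basename(fileName.trim()).replace(/\.csv$/i, "") + ".csv";
      await fs.writeFile(path.join(dataDir, safeName), toCsv(docs, fields), "utf8");
      return res.status(201).json({ file: safeName, rows: docs.length });
    } catch (err) {
      return res.status(500).json({ error: err.message });
    }
  };
}

module.exports = { ALLOWED_FIELDS, escapeCsv, toCsv, makeCsvSaveController };
```

Embedding vectors are written as a JSON array inside a single quoted cell, which keeps each document on one row. A different encoding (one column per dimension, for instance) would be equally valid if the analysis tooling prefers it.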
Update Routes:
Add appropriate routes and endpoints for the new functionalities in the routes folder.
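As an illustration of the routing step, the sketch below registers a single endpoint for the CSV export. The file name, the helper name registerContentRoutes, and the path /content/export-csv are hypothetical. The router argument is expected to behave like an Express router.

```javascript
// routes/contentRoutes.js (sketch; file name and endpoint path are assumptions)

// Attaches the CSV export handler to the given router and returns the router.
function registerContentRoutes(router, handlers) {
  // POST is used because the file name and field list arrive in req.body.
  router.post("/content/export-csv", handlers.csvSave);
  return router;
}

module.exports = registerContentRoutes;
```

With Express, the router would come from `express.Router()` and the result would be mounted on the app with `app.use(...)`. A JSON body parser such as `express.json()` also needs to be enabled so that req.body is populated.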
Expected Outcome
Upon completion of these tasks, we will have MongoDB configured to store scraped content data, a content model for structured data storage, and the ability to convert content to CSV format for analysis. This will enhance our project's capabilities significantly.
Note: Please create a separate commit for each task, and include relevant documentation and comments in your code. The necessary folders and files have already been created. Also provide a Postman collection for testing the new endpoints.