mgmgpyaesonewin / web-crawler-assignment

[Question] Security of the web application #14

Open malparty opened 3 months ago

malparty commented 3 months ago

Web applications are exposed to the Internet and all its threats. 🥷🏻 🤖 If this project were a real client project (with more time allocated), which security issues would you recommend fixing, and why? Note that this question is mainly about the code, not the infrastructure.

mgmgpyaesonewin commented 3 months ago

On the security side, these are the things I would work on.

On the frontend layer, I would add proper input validation and sanitization to prevent XSS. This is important because the keywords are used for scraping and are also stored in the database.
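As a rough sketch of the validation and escaping idea (Python is used purely for illustration, and `validate_keyword`, `render_keyword`, and the 100-character limit are hypothetical, not taken from this project): validate keywords when they arrive, and escape them whenever they are rendered into HTML. The escaping step is the one that actually stops XSS; validation only narrows what gets stored and scraped.

```python
import html
import unicodedata

# Assumed limit for illustration only; the project may need a different value.
MAX_KEYWORD_LENGTH = 100


def validate_keyword(raw: str) -> str:
    """Normalize a keyword and reject empty, overlong, or control-character input."""
    keyword = unicodedata.normalize("NFKC", raw).strip()
    if not keyword:
        raise ValueError("keyword is empty")
    if len(keyword) > MAX_KEYWORD_LENGTH:
        raise ValueError("keyword is too long")
    # Category "Cc" covers control characters such as NUL, tab, and newline.
    if any(unicodedata.category(ch) == "Cc" for ch in keyword):
        raise ValueError("keyword contains control characters")
    return keyword


def render_keyword(keyword: str) -> str:
    """Escape a keyword for use in HTML text or a quoted attribute value."""
    return html.escape(keyword, quote=True)
```

Most template engines escape output by default, so in practice the second function mostly matters wherever a keyword is rendered through a raw or unescaped output path.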

I would also improve authentication and authorization, verifying that each component can only access the resources it is permitted to. The Spider callback in particular needs stronger authentication and authorization controls, since that is where our crawled data is written to the database. Finally, I would improve error handling and logging: only the admin should be able to see the logs and failure cases for the crawlers, and errors in CSV file uploads should be handled properly.
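One way the callback could be authenticated is an HMAC signature over the request body, using a secret shared between the spider and the application. This is a minimal sketch under that assumption; the function names and the idea of sending the signature in a request header are hypothetical, not how the project currently works.

```python
import hashlib
import hmac


def sign_body(secret: bytes, body: bytes) -> str:
    """Return the hex HMAC-SHA256 of the raw request body (computed by the spider)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def is_valid_callback(secret: bytes, body: bytes, signature: str) -> bool:
    """Check the signature sent with a callback against the expected one."""
    expected = sign_body(secret, body)
    # compare_digest runs in constant time, which avoids leaking how many
    # leading characters of the signature were correct.
    return hmac.compare_digest(expected.encode(), signature.encode())
```

The spider would send `sign_body(secret, body)` along with each callback, and the endpoint would reject any request for which `is_valid_callback` returns `False` before touching the database. A signature alone does not prevent replaying a captured request; a timestamp or nonce inside the signed body would be needed for that.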

malparty commented 3 months ago

I agree. With the current (unauthenticated) Spider callback, and especially with the integer auto-increment primary key for users, it is quite easy to call the endpoint and insert data for existing users (or to test whether a given user exists).
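To illustrate the enumeration problem: with sequential IDs, an attacker can simply try 1, 2, 3, and so on. One possible mitigation (an assumption on my side, and a complement to authenticating the callback rather than a replacement for it) is to expose only a random public identifier and keep the integer key internal. The `UserDirectory` class below is a hypothetical in-memory stand-in for a users table.

```python
import uuid
from typing import Dict, Optional


class UserDirectory:
    """Hypothetical in-memory stand-in for a users table."""

    def __init__(self) -> None:
        self._next_id = 1
        self._by_public_id: Dict[str, int] = {}

    def create_user(self) -> str:
        """Create a user and return only its random public identifier."""
        internal_id = self._next_id  # sequential key, never exposed
        self._next_id += 1
        public_id = str(uuid.uuid4())  # random, so not guessable by counting
        self._by_public_id[public_id] = internal_id
        return public_id

    def resolve(self, public_id: str) -> Optional[int]:
        """Map a public identifier back to the internal key, or None if unknown."""
        return self._by_public_id.get(public_id)
```

With this shape, a callback that sends `"2"` as the user reference resolves to nothing, even though a user with internal ID 2 may exist.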

I would add to your suggestions: