mediar-ai / screenpipe

rewind.ai x cursor.com = your AI assistant that has all the context. 24/7 screen & voice recording for the age of super intelligence. get your data ready or be left behind
https://screenpi.pe
MIT License
9.59k stars · 557 forks

$200 - encrypt data #466

Open louis030195 opened 1 month ago

louis030195 commented 1 month ago

please do not start working on this task until we have some clarity on the specific subtasks to be done

can you suggest different "milestones"?

random things i think of:

any other feature that increases security (better PII removal ...)

also, if anyone knows how hard it would be to implement this in rust: https://github.com/mediar-ai/screenpipe/tree/main/examples/python/local-llm-pii-removal

then we can write down subtasks accordingly and start the bounty!

/bounty 200

linear[bot] commented 1 month ago

MED-177 $300 - encrypt data

algora-pbc[bot] commented 1 month ago

💎 $200 bounty • Screenpi.pe

Steps to solve:

  1. Start working: Comment /attempt #466 with your implementation plan
  2. Submit work: Create a pull request including /claim #466 in the PR body to claim the bounty
  3. Receive payment: 100% of the bounty is received 2-5 days post-reward. Make sure you are eligible for payouts

Thank you for contributing to mediar-ai/screenpipe!

Saturn225 commented 1 month ago

@louis030195 After reviewing the Screenpipe repository and considering the security requirements, here is the breakdown of milestones and subtasks I have in mind for encrypting data:

Milestone 1: Data at Rest Encryption

Milestone 2: Encrypt Data in HTTP Requests

Milestone 3: Bitwarden/Crypto Wallet-like Security

Milestone 4: OS-Level Encryption Integration

Milestone 5: PII Removal & Data Obfuscation

Milestone 6: Investigate Rust for Encryption & PII Removal
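
As a rough starting point for Milestone 5, here is a minimal sketch of rule-based redaction. Note that this is a swapped-in, simpler technique: the example linked earlier in the thread uses a local LLM, whereas this uses regular expressions. The patterns, placeholder tokens, and the `redact` function name are all assumptions for illustration; real PII detection needs far more coverage and will produce false positives (for example, any long digit run looks like a card number here).

```python
import re

# Hypothetical patterns: deliberately simple, not exhaustive.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# 13 to 16 digits, optionally separated by single spaces or hyphens.
CARD = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def redact(text):
    """Replace email addresses and card-like digit runs with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    text = CARD.sub("[CARD]", text)
    return text
```

The same two-pass structure should port to Rust fairly directly, since it only relies on ordinary regex substitution.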

Saturn225 commented 1 month ago

For a $300 bounty, implementing all six milestones might be too ambitious, as each milestone could involve significant research, coding, and testing. Maybe we can tackle the other milestones in follow-up issues.

After reviewing the effort involved, I propose focusing on two tasks that are feasible within the bounty scope. Here is the breakdown of the tasks I would take on, if that works for you:

Task 1: Encrypt Data in HTTP Requests

This is as outlined above

Task 2: Investigate Rust for Encryption & PII Removal

louis030195 commented 1 month ago

Here are 50 potential solutions to enhance screenpipe's security while maintaining API, file, and database access:

  1. implement end-to-end encryption for all data
  2. use hardware-backed key storage (e.g. TPM, Secure Enclave)
  3. add multi-factor authentication for app access
  4. implement certificate pinning for API connections
  5. use encrypted databases like SQLCipher
  6. implement file-level encryption for stored data
  7. use secure random number generators for all crypto operations
  8. implement robust input validation and sanitization
  9. use prepared statements to prevent SQL injection
  10. implement rate limiting on API endpoints
  11. use HTTPS for all network communications
  12. implement proper session management and token handling
  13. use secure coding practices to prevent buffer overflows
  14. implement secure key rotation policies
  15. use sandboxing for plugin execution
  16. implement secure boot and code signing
  17. use memory encryption techniques
  18. implement secure logging practices
  19. use secure deletion methods for sensitive data
  20. implement network segmentation for different components
  21. use obfuscation techniques for sensitive code
  22. implement secure update mechanisms
  23. use runtime application self-protection (RASP)
  24. implement secure inter-process communication
  25. use trusted execution environments where available
  26. implement secure key derivation functions
  27. use secure random password generation
  28. implement secure backup and recovery processes
  29. use secure time synchronization
  30. implement secure error handling and logging
  31. use security headers in API responses
  32. implement API versioning for better security control
  33. use OAuth 2.0 or OpenID Connect for authentication
  34. implement IP whitelisting for API access
  35. use JSON Web Tokens (JWT) for secure data exchange
  36. implement secure websocket connections
  37. use content security policy (CSP) headers
  38. implement subresource integrity checks
  39. use HSTS to enforce HTTPS
  40. implement API request signing
  41. use secure random IDs for resources
  42. implement secure file upload handling
  43. use secure cookie attributes
  44. implement secure cross-origin resource sharing (CORS) policies
  45. use security scanners and static analysis tools
  46. implement secure configuration management
  47. use secure defaults for all settings
  48. implement secure password reset mechanisms
  49. use secure session storage techniques
  50. implement secure data erasure methods for end-of-life

remember, the key is to implement these solutions in a way that balances security with usability and performance. some of these may require careful consideration and testing to ensure they don't negatively impact the user experience or system performance.
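
To make one item from the list concrete (item 9, prepared statements), here is a minimal sketch using Python's built-in `sqlite3` module. The `frames` table and the `find_by_app` function are hypothetical and do not reflect screenpipe's actual schema; the point is only that the user-supplied value is bound as a parameter rather than concatenated into the SQL string.

```python
import sqlite3


def find_by_app(conn, app_name):
    """Return stored text for one app, binding app_name as a parameter."""
    # The ? placeholder passes app_name as data, never as SQL text,
    # so quotes or SQL keywords inside it cannot change the query.
    rows = conn.execute(
        "SELECT text FROM frames WHERE app = ? ORDER BY id",
        (app_name,),
    )
    return [row[0] for row in rows]
```

With this shape, an input such as `x' OR '1'='1` is just an app name that matches nothing.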

ologbonowiwi commented 4 weeks ago

A possible first step would be to encrypt the database. There are a few alternatives, like Turso, SQLCipher, or whichever option is easiest to integrate into our codebase. We can use this keyring implementation. It supports Linux, Mac, Windows, and more.

From what I see, since launchbadge/sqlx#2014 we may be able to get encryption on sqlx out of the box.

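My implementation plan:

  • on the first launch without a key (so that existing screenpipe users are supported without deleting any data), generate a secure key and store it in the native keyring/vault
  • use the keyring lib to fetch the key
  • use the key to encrypt/decrypt the db (when opening/closing the connection or something)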

Handling media (file encryption) and using HTTPS on the API will each require its own issue. Doing everything at once is too broad a scope given the uncertainty, and it is bug-prone. Because of that, I recommend keeping separate PRs for each of these things.

If you agree with the implementation plan, the first step will be to write some integration tests: ensure the API is accessing the database, interact with the API, check the values in the database, and so on. This way, we can change the core implementation with confidence that it won't break (and if it does break, we'll see it quickly on the PRs). But that's not in the scope of the $200 bounty, so let me know what you think @louis030195.
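
A minimal sketch of the kind of round-trip test described here, written against an in-memory SQLite database. Everything named below (`ocr_text`, `api_add_text`, `api_search`) is a hypothetical stand-in; screenpipe's real schema and API handlers are not shown in this thread. The idea is to pin down current behavior first, so the same test can be rerun unchanged once the storage layer is encrypted.

```python
import sqlite3


def create_schema(conn):
    # Hypothetical table, for illustration only.
    conn.execute(
        "CREATE TABLE ocr_text (id INTEGER PRIMARY KEY, text TEXT NOT NULL)"
    )


def api_add_text(conn, text):
    """Stand-in for a write made through the API layer."""
    cur = conn.execute("INSERT INTO ocr_text (text) VALUES (?)", (text,))
    conn.commit()
    return cur.lastrowid


def api_search(conn, query):
    """Stand-in for a search endpoint: substring match on stored text."""
    rows = conn.execute(
        "SELECT text FROM ocr_text WHERE text LIKE ? ORDER BY id",
        (f"%{query}%",),
    )
    return [row[0] for row in rows]


def test_write_then_read_roundtrip():
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    row_id = api_add_text(conn, "hello screenpipe")
    # Check the value directly in the database, bypassing the read path.
    stored = conn.execute(
        "SELECT text FROM ocr_text WHERE id = ?", (row_id,)
    ).fetchone()
    assert stored == ("hello screenpipe",)
    # Check the same value through the read path.
    assert api_search(conn, "screenpipe") == ["hello screenpipe"]
    conn.close()
```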

louis030195 commented 4 weeks ago

A possible first step would be to encrypt the database. There are a few alternatives, like Turso, SQLCipher, or whichever option is easiest to integrate into our codebase. We can use this keyring implementation. It supports Linux, Mac, Windows, and more.

From what I see, since launchbadge/sqlx#2014 we may be able to get encryption on sqlx out of the box.

My implementation plan:

  • on the first launch without a key (so that existing screenpipe users are supported without deleting any data), generate a secure key and store it in the native keyring/vault
  • use the keyring lib to fetch the key
  • use the key to encrypt/decrypt the db (when opening/closing the connection or something)
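
The first two steps of this plan can be sketched as a get-or-create lookup. This is only an illustration under stated assumptions: `InMemoryKeyring` is a stand-in for a native keyring/vault (a real one persists across runs), and the `SERVICE` and `ACCOUNT` names are hypothetical. Only the standard library is used.

```python
import secrets

SERVICE = "screenpipe"      # hypothetical keyring service name
ACCOUNT = "db-encryption"   # hypothetical keyring entry name


class InMemoryKeyring:
    """Stand-in for an OS keyring/vault; contents are lost on exit."""

    def __init__(self):
        self._items = {}

    def get_password(self, service, account):
        return self._items.get((service, account))

    def set_password(self, service, account, value):
        self._items[(service, account)] = value


def load_or_create_db_key(keyring):
    """Return the hex-encoded database key, creating it on first run."""
    key = keyring.get_password(SERVICE, ACCOUNT)
    if key is None:
        # 32 random bytes from the OS CSPRNG, i.e. a 256-bit key.
        key = secrets.token_hex(32)
        keyring.set_password(SERVICE, ACCOUNT, key)
    return key
```

The returned key would then be handed to whichever encrypted database layer is chosen when the connection is opened (SQLCipher, for instance, takes it through `PRAGMA key`). Migrating an existing unencrypted database is a separate step that this sketch does not cover.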

Handling media (file encryption) and using HTTPS on the API will each require its own issue. Doing everything at once is too broad a scope given the uncertainty, and it is bug-prone. Because of that, I recommend keeping separate PRs for each of these things.

If you agree with the implementation plan, the first step will be to write some integration tests: ensure the API is accessing the database, interact with the API, check the values in the database, and so on. This way, we can change the core implementation with confidence that it won't break (and if it does break, we'll see it quickly on the PRs). But that's not in the scope of the $200 bounty, so let me know what you think @louis030195.

nice, yeah, that sounds like a good plan. can you also take notes on the overhead it adds in terms of resource usage?
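
One way to collect such notes is a small before/after harness: run the same workload against the unencrypted and the encrypted path and compare. The sketch below is an assumption-laden illustration in Python, not screenpipe's tooling: `measure` and `insert_rows` are hypothetical names, the workload is a plain in-memory SQLite insert, and `tracemalloc` only sees Python-level allocations, so a Rust build would need a profiler of its own.

```python
import sqlite3
import time
import tracemalloc


def measure(fn, repeats=5):
    """Return (best wall time in seconds, peak traced memory in bytes)."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    # Memory is traced in a separate run so tracing does not skew timings.
    tracemalloc.start()
    fn()
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return min(timings), peak


def insert_rows(n=1000):
    """Baseline workload: insert n small rows into an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (v TEXT)")
    conn.executemany(
        "INSERT INTO t (v) VALUES (?)", (("x" * 100,) for _ in range(n))
    )
    conn.commit()
    conn.close()
```

Calling `measure(insert_rows)` gives the baseline; the encrypted variant of the workload would be measured the same way and the two results compared.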

louis030195 commented 3 days ago

https://github.com/microsoft/windows-rs/blob/master/crates/samples/windows/data_protection/src/main.rs