$200 - encrypt data

louis030195 commented 1 month ago

please do not start working on this task until we have some clarity on the specific subtasks to be done

can you suggest different "milestones"?

random things i think of:

encrypt data at rest and somehow still be able to query (look how supabase/pgsql does it)
encrypt data in http requests
bitwarden or crypto wallet like you need a password or fingerprint to open screenpipe
native apple, microsoft, OS encryption feature (vault etc)

any other feature that increases security (better PII removal ...)

also if anyone knows how hard it would be to implement this in rust: https://github.com/mediar-ai/screenpipe/tree/main/examples/python/local-llm-pii-removal

then we can write down subtasks accordingly and start the bounty!

/bounty 200

linear[bot] commented 1 month ago

MED-177 $300 - encrypt data

algora-pbc[bot] commented 1 month ago

💎 $200 bounty • Screenpi.pe

Steps to solve:

Start working: Comment /attempt #466 with your implementation plan
Submit work: Create a pull request including /claim #466 in the PR body to claim the bounty
Receive payment: 100% of the bounty is received 2-5 days post-reward. Make sure you are eligible for payouts

Thank you for contributing to mediar-ai/screenpipe!


Saturn225 commented 1 month ago

@louis030195 After reviewing the Screenpipe repository and considering the security requirements, here is a detailed breakdown of the milestones and subtasks I would propose for encrypting data:

Milestone 1: Data at Rest Encryption

Encrypt MP4 files on disk as soon as they are written using AES or similar encryption methods
Implement database encryption to secure sensitive data stored locally following examples like Supabase or PostgreSQL for queryable encrypted data
Provide a way for users to unlock/decrypt files programmatically using native OS features like Apple’s Secure Enclave or BitLocker.

Milestone 2: Encrypt Data in HTTP Requests

Ensure HTTP requests are encrypted using SSL/TLS, and consider additional payload encryption for extra security.
Test and validate all encrypted transmissions to ensure no data is exposed during transport.

Milestone 3: Bitwarden/Crypto Wallet-like Security

Implement a password/biometric authentication system (e.g., Apple/Windows APIs) for users to unlock encrypted data.
Design a secure key management system to store encryption keys

Milestone 4: OS-Level Encryption Integration

Use native OS encryption features like Apple Secure Enclave or BitLocker to enhance security.
Implement fallbacks for unsupported OS by utilizing Rust-based encryption libraries.

Milestone 5: PII Removal & Data Obfuscation

Integrate a PII detection and removal tool to sanitize sensitive data before storage.
Implement logging mechanisms to flag detected PII and ensure its removal.
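To make the "sanitize before storage" step in Milestone 5 concrete, here is a minimal sketch in Rust using only the standard library. It is an assumption-laden illustration, not the approach of the linked Python example: it replaces whitespace-separated tokens that look like email addresses with a placeholder. The function names and the `[EMAIL]` placeholder are hypothetical, and a real detector would need to cover many more PII types.

```rust
/// Replace tokens that look like email addresses with "[EMAIL]".
/// Whitespace is preserved as-is; punctuation attached to a redacted
/// token (e.g. a trailing comma) is replaced together with it.
pub fn redact_emails(text: &str) -> String {
    let mut out = String::with_capacity(text.len());
    let mut token = String::new();
    for ch in text.chars() {
        if ch.is_whitespace() {
            flush(&mut token, &mut out);
            out.push(ch);
        } else {
            token.push(ch);
        }
    }
    flush(&mut token, &mut out);
    out
}

/// Write the pending token (redacted if needed) and reset it.
fn flush(token: &mut String, out: &mut String) {
    if looks_like_email(token.as_str()) {
        out.push_str("[EMAIL]");
    } else {
        out.push_str(token.as_str());
    }
    token.clear();
}

/// Heuristic only: exactly one '@', a non-empty local part, and a
/// domain with an interior '.'. This is not a full address parser.
fn looks_like_email(token: &str) -> bool {
    // Ignore surrounding punctuation such as '<', '>' or ','.
    let core = token.trim_matches(|c: char| !c.is_alphanumeric());
    let mut parts = core.split('@');
    match (parts.next(), parts.next(), parts.next()) {
        (Some(local), Some(domain), None) => {
            !local.is_empty()
                && domain.contains('.')
                && !domain.starts_with('.')
                && !domain.ends_with('.')
        }
        _ => false,
    }
}
```

A rule-based pass like this is cheap enough to run on every OCR or transcript string before it is written, but it will miss anything the heuristic does not describe, so it would complement rather than replace a model-based detector.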

Milestone 6: Investigate Rust for Encryption & PII Removal

Research, explore, and document Rust libraries for encryption (e.g. RustCrypto) and for PII removal.
Provide a proof-of-concept implementation showing Rust’s performance benefits for these tasks.

Saturn225 commented 1 month ago

For a $300 bounty, implementing all six milestones might be too ambitious, as each milestone could involve significant research, coding, and testing. Maybe we can tackle the other milestones in follow-up issues.

After reviewing the effort involved, I propose focusing on two tasks that are feasible within the bounty scope. Here is the breakdown of the tasks I would take on, if that aligns with your plans:

Task 1: Encrypt Data in HTTP Requests

This is as outlined above

Task 2: Investigate Rust for Encryption & PII Removal

Research Rust libraries that support encryption and PII removal.
Document the feasibility and complexity of implementing these libraries in Screenpipe.
Provide initial examples for proof-of-concept code to show how Rust could handle encryption/PII removal.

louis030195 commented 1 month ago

Here are 50 potential solutions to enhance screenpipe's security while maintaining API, file, and database access:

implement end-to-end encryption for all data
use hardware-backed key storage (e.g. TPM, Secure Enclave)
add multi-factor authentication for app access
implement certificate pinning for API connections
use encrypted databases like SQLCipher
implement file-level encryption for stored data
use secure random number generators for all crypto operations
implement robust input validation and sanitization
use prepared statements to prevent SQL injection
implement rate limiting on API endpoints
use HTTPS for all network communications
implement proper session management and token handling
use secure coding practices to prevent buffer overflows
implement secure key rotation policies
use sandboxing for plugin execution
implement secure boot and code signing
use memory encryption techniques
implement secure logging practices
use secure deletion methods for sensitive data
implement network segmentation for different components
use obfuscation techniques for sensitive code
implement secure update mechanisms
use runtime application self-protection (RASP)
implement secure inter-process communication
use trusted execution environments where available
implement secure key derivation functions
use secure random password generation
implement secure backup and recovery processes
use secure time synchronization
implement secure error handling and logging
use security headers in API responses
implement API versioning for better security control
use OAuth 2.0 or OpenID Connect for authentication
implement IP whitelisting for API access
use JSON Web Tokens (JWT) for secure data exchange
implement secure websocket connections
use content security policy (CSP) headers
implement subresource integrity checks
use HSTS to enforce HTTPS
implement API request signing
use secure random IDs for resources
implement secure file upload handling
use secure cookie attributes
implement secure cross-origin resource sharing (CORS) policies
use security scanners and static analysis tools
implement secure configuration management
use secure defaults for all settings
implement secure password reset mechanisms
use secure session storage techniques
implement secure data erasure methods for end-of-life

remember, the key is to implement these solutions in a way that balances security with usability and performance. some of these may require careful consideration and testing to ensure they don't negatively impact the user experience or system performance.

ologbonowiwi commented 4 weeks ago

A possible first step would be to encrypt the database. There are a few alternatives, like Turso, SQLCipher, or whichever option we are least reluctant to adopt in our codebase. We can use this keyring implementation. It supports Linux, Mac, Windows, and more.

From what I see since launchbadge/sqlx#2014 we may be able to get encryption on sqlx out of the box.

My implementation plan:

on the first initialization of the app without a key (so that people who were already using screenpipe keep their data), generate a safe key and store it in the native keyring/vault
use the keyring lib to fetch the key
use the key to encrypt/decrypt the db (when opening/closing the connection or something)

Handling media (file encryption) and using HTTPS on the API will each require its own issue. Doing everything at once is too broad a scope given the uncertainty, and it is bug-prone. Because of that, I recommend keeping separate PRs for each of these things.

If you agree with the implementation plan, the first step will be to write some integration tests: ensure the API is accessing the database, interact with the API, check the values in the database, and so on. That way, we can change the core implementation with confidence that it won't break (and if it does break, we'll see it quickly on the PRs). But that's not in the scope of the $200 bounty, so let me know what you think @louis030195.
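The key-handling steps of the plan above (fetch the key, or generate and store one on first run) can be sketched as follows. This is a minimal illustration under stated assumptions: `KeyStore`, `MemoryStore`, `KeyOrigin`, and `load_or_create_key` are hypothetical names, the in-memory store merely stands in for an OS keyring, and the key generator is passed in as a closure because the standard library has no cryptographically secure random source.

```rust
use std::collections::HashMap;

/// Hypothetical abstraction over an OS keyring or vault.
pub trait KeyStore {
    fn get(&self, name: &str) -> Option<Vec<u8>>;
    fn set(&mut self, name: &str, key: &[u8]);
}

/// In-memory stand-in for illustration and tests only.
/// A real build would back this with the platform keyring.
#[derive(Default)]
pub struct MemoryStore {
    entries: HashMap<String, Vec<u8>>,
}

impl KeyStore for MemoryStore {
    fn get(&self, name: &str) -> Option<Vec<u8>> {
        self.entries.get(name).cloned()
    }

    fn set(&mut self, name: &str, key: &[u8]) {
        self.entries.insert(name.to_string(), key.to_vec());
    }
}

/// Tells the caller whether the key already existed.
#[derive(Debug, PartialEq)]
pub enum KeyOrigin {
    /// Key was found in the store: open the database with it.
    Existing,
    /// Key was just generated: an existing plaintext database
    /// would need a one-time migration before normal use.
    Created,
}

/// Fetch the key named `name`, or generate and persist a new one.
/// `generate` must be backed by a secure random source in real code.
pub fn load_or_create_key<S: KeyStore>(
    store: &mut S,
    name: &str,
    generate: impl FnOnce() -> Vec<u8>,
) -> (Vec<u8>, KeyOrigin) {
    if let Some(key) = store.get(name) {
        return (key, KeyOrigin::Existing);
    }
    let key = generate();
    store.set(name, &key);
    (key, KeyOrigin::Created)
}
```

Returning `KeyOrigin` keeps the migration decision for users with pre-existing data explicit at the call site, and the trait boundary lets the integration tests mentioned above run without touching a real keyring.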

louis030195 commented 4 weeks ago

nice, yeah, that sounds like a good plan. can you also take notes on the overhead it adds in terms of resource usage?
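One rough way to take such notes is to time the same workload with and without encryption and report the ratio. The sketch below is an assumption, not an agreed benchmark: `time_runs` and `overhead_ratio` are hypothetical helpers, they capture wall-clock time only, and CPU and memory usage would need separate tooling.

```rust
use std::time::{Duration, Instant};

/// Run `op` `runs` times and return the total wall-clock time.
pub fn time_runs<F: FnMut()>(runs: u32, mut op: F) -> Duration {
    let start = Instant::now();
    for _ in 0..runs {
        op();
    }
    start.elapsed()
}

/// Ratio of candidate time to baseline time (1.5 means 50% slower).
/// Returns None when the baseline is zero and no ratio is defined.
pub fn overhead_ratio(baseline: Duration, candidate: Duration) -> Option<f64> {
    let base = baseline.as_secs_f64();
    if base == 0.0 {
        None
    } else {
        Some(candidate.as_secs_f64() / base)
    }
}
```

In practice the two closures passed to `time_runs` would issue the same queries against a plain and an encrypted database, with enough runs to smooth out noise.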

louis030195 commented 3 days ago

https://github.com/microsoft/windows-rs/blob/master/crates/samples/windows/data_protection/src/main.rs
