Closed · cparish312 closed this 3 weeks ago
@louis030195 Not sure how you want to add tests for the frame-writing-to-mp4 bit.
Also, currently you can't specify the OcrEngine used to generate the ocr_results; it just falls back to the default engine.
how can i test?
Working on testing now. Right now there is a foreign key constraint requiring the audio_transcription table to have an audio_chunk_id in the audio_chunks table. How would you like to handle this for adding transcriptions without associated audio_chunks?
@cparish312
hmm okay so the use case is that you don't have the .mp4 audio recording to share?
like maybe someone is syncing a manual iphone recording and doesn't have the .mp4 (or can't be bothered), and we want to let them sync just the transcription without an audio chunk
that's a bit annoying because all the code is built around this: the search, the UI displaying the path to the video in results, etc.
dumb workaround is to generate the chunk with AI TTS XD
what are the possible solutions?
i guess we have to make it nullable and just not show any audio chunk in the UI
@louis030195 Okay cool yeah that probably makes the most sense
Oh man didn't realize how much of a pain a nullable migration is in SQLite
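(For context, SQLite can't alter an existing column's constraints, so making audio_chunk_id nullable means rebuilding the table. A minimal sketch of that pattern, using the table/column names from the discussion above; the db filename and the rest of the column list are assumptions, not the actual screenpipe schema or migration:)

```bash
sqlite3 screenpipe.db <<'SQL'
-- sketch only: recreate the table with audio_chunk_id nullable,
-- since SQLite cannot drop NOT NULL from an existing column
PRAGMA foreign_keys = OFF;
BEGIN;
CREATE TABLE audio_transcription_new (
    id INTEGER PRIMARY KEY,
    audio_chunk_id INTEGER REFERENCES audio_chunks(id), -- no NOT NULL: nullable now
    transcription TEXT NOT NULL,
    timestamp TEXT NOT NULL
);
INSERT INTO audio_transcription_new
    SELECT id, audio_chunk_id, transcription, timestamp FROM audio_transcription;
DROP TABLE audio_transcription;
ALTER TABLE audio_transcription_new RENAME TO audio_transcription;
COMMIT;
PRAGMA foreign_keys = ON;
SQL
```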
Yeah this is causing some issues for searching
The transcription shows up when hitting the search endpoint, but it isn't appearing in the UI. Assuming this is because there is no path / audio_chunk_id. How do you want to handle these results in the UI?
Updated it so the app search UI just shows "No file path available for this audio." when there is no audio path
Testing transcription insert:
curl -X POST "http://localhost:3030/add" -H "Content-Type: application/json" -d '{
  "device_name": "MacBook Pro Microphone (input)",
  "content": {
    "content_type": "transcription",
    "data": {
      "transcription": "This is an example transcription of recorded audio.",
      "transcription_engine": "speech_to_text_v1"
    }
  }
}'
Testing frames insert. Will need to change the file_paths to paths that exist on your computer:

curl -X POST "http://localhost:3030/add" -H "Content-Type: application/json" -d '{
  "device_name": "hindsight_android",
  "content": {
    "content_type": "frames",
    "data": [
      {
        "file_path": "/Users/connorparish/.hindsight_server/data/raw_screenshots/2024/06/03/com-google-android-deskclock/com-google-android-deskclock_1717433244710.jpg",
        "timestamp": "2024-06-03T16:47:24.710000038Z",
        "app_name": "Clock",
        "window_name": "Clock",
        "ocr_results": [],
        "tags": ["hindsight", "Clock"]
      },
      {
        "file_path": "/Users/connorparish/.hindsight_server/data/raw_screenshots/2024/06/03/com-google-android-deskclock/com-google-android-deskclock_1717433242624.jpg",
        "timestamp": "2024-06-03T16:47:22.624000072Z",
        "app_name": "Clock",
        "window_name": "Clock",
        "ocr_results": [],
        "tags": ["hindsight", "Clock"]
      }
    ]
  }
}'
will try today
curl -X POST "http://localhost:3035/add" -H "Content-Type: application/json" -d '{
  "device_name": "MacBook Pro Microphone (input)",
  "content": {
    "content_type": "transcription",
    "data": {
      "transcription": "This is an example transcription of recorded audio.",
      "transcription_engine": "speech_to_text_v1"
    }
  }
}' | jq
curl -X GET "http://localhost:3035/search?q=example&content_type=audio" -H "Content-Type: application/json" | jq
{
  "data": [
    {
      "type": "Audio",
      "content": {
        "chunk_id": 8,
        "transcription": "This is an example transcription of recorded audio.",
        "timestamp": "2024-10-29T21:34:06.615182Z",
        "file_path": "",
        "offset_index": -1,
        "tags": [],
        "device_name": "MacBook Pro Microphone (input)",
        "device_type": "Input"
      }
    }
  ],
  "pagination": {
    "limit": 20,
    "offset": 0,
    "total": 1
  }
}
curl -X POST "http://localhost:3035/add" -H "Content-Type: application/json" -d '{
  "device_name": "macbook_pro",
  "content": {
    "content_type": "frames",
    "data": [
      {
        "file_path": "'$HOME'/Library/Mobile Documents/com~apple~CloudDocs/Desktop/Screenshots/02722091-76A7-4215-9CAB-E4A4DC5A37BA.png",
        "timestamp": "2024-03-14T16:47:24.710Z",
        "app_name": "Desktop",
        "window_name": "Screenshot",
        "ocr_results": [],
        "tags": ["screenshot", "desktop"]
      },
      {
        "file_path": "'$HOME'/Library/Mobile Documents/com~apple~CloudDocs/Desktop/Screenshots/0D7F899B-DE6B-494E-B70D-1F5338A54AEE.png",
        "timestamp": "2024-03-14T16:47:22.624Z",
        "app_name": "Desktop",
        "window_name": "Screenshot",
        "ocr_results": [],
        "tags": ["screenshot", "desktop"]
      }
    ]
  }
}' | jq
curl -X GET "http://localhost:3035/search?window_name=screenshot&content_type=ocr&limit=1000" -H "Content-Type: application/json" | jq
{
  "success": true,
  "message": "Frames added successfully"
}
{
  "data": [],
  "pagination": {
    "limit": 1000,
    "offset": 0,
    "total": 0
  }
}
not sure if i made a mistake, but i expected to get the frame here
also not seeing the merged video
(env) (base) louisbeaumont@mac:~/Documents/screen-pipe$ ls /tmp/sp/data/
Display 1 (output)_2024-10-29_21-27-01.mp4 monitor_1_2024-10-29_21-31-45.mp4 monitor_1_2024-10-29_21-38-39.mp4
MacBook Pro Microphone (input)_2024-10-29_21-27-14.mp4 monitor_1_2024-10-29_21-32-53.mp4 monitor_1_2024-10-29_21-39-49.mp4
macbook_pro_2024-10-29_21-40-21.mp4 monitor_1_2024-10-29_21-34-22.mp4 monitor_1_2024-10-29_21-41-04.mp4
macbook_pro_2024-10-29_21-43-37.mp4 monitor_1_2024-10-29_21-35-34.mp4 monitor_1_2024-10-29_21-42-15.mp4
monitor_1_2024-10-29_21-26-44.mp4 monitor_1_2024-10-29_21-37-28.mp4
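(Two quick checks that could help debug this, sketched with guessed paths: the db filename and the `frames` table name are assumptions, not confirmed against screenpipe's layout:)

```bash
# sanity-check one of the merged mp4s from the listing above
ffprobe -v error -show_entries format=format_name,duration \
  "/tmp/sp/data/macbook_pro_2024-10-29_21-40-21.mp4"

# see whether the inserted frames reached the database at all
sqlite3 /tmp/sp/db.sqlite \
  "SELECT id, timestamp FROM frames ORDER BY id DESC LIMIT 5;"
```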
also don't you have OCR?
i assume this API might be used in a very broad range of use cases so it should be flexible, for example running OCR ourselves when no ocr_results are provided
for the scope of this PR we can stick to the minimum i think, not much post-processing
Yeah, I agree running OCR by default when OCR results are not provided would be ideal, but that sounds good to add in another PR.
Are the macbook_pro videos not the merged videos? I'm storing them as "{device_name}_{current_time}.mp4".
Maybe they aren't appearing in the search since there are no OCR results? Could you try putting in OCR results?
curl -X POST "http://localhost:3035/add" -H "Content-Type: application/json" -d '{
  "device_name": "macbook_pro",
  "content": {
    "content_type": "frames",
    "data": [
      {
        "file_path": "'$HOME'/Library/Mobile Documents/com~apple~CloudDocs/Desktop/Screenshots/02722091-76A7-4215-9CAB-E4A4DC5A37BA.png",
        "timestamp": "2024-03-14T16:47:24.710Z",
        "app_name": "Desktop",
        "window_name": "Screenshot",
        "ocr_results": [{
          "text": "test add frames with ocr results",
          "text_json": "{}",
          "ocr_engine": "apple_native"
        }],
        "tags": ["screenshot", "desktop"]
      },
      {
        "file_path": "'$HOME'/Library/Mobile Documents/com~apple~CloudDocs/Desktop/Screenshots/0D7F899B-DE6B-494E-B70D-1F5338A54AEE.png",
        "timestamp": "2024-03-14T16:47:22.624Z",
        "app_name": "Desktop",
        "window_name": "Screenshot",
        "ocr_results": [{
          "text": "test add frames with ocr results 2",
          "text_json": "{}",
          "ocr_engine": "apple_native"
        }],
        "tags": ["screenshot", "desktop"]
      }
    ]
  }
}' | jq
works!
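(the response below is presumably from re-running the earlier OCR search against the dev instance:)

```bash
curl -X GET "http://localhost:3035/search?window_name=screenshot&content_type=ocr&limit=1000" \
  -H "Content-Type: application/json" | jq
```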
{
  "data": [
    {
      "type": "OCR",
      "content": {
        "frame_id": 1,
        "text": "test add frames with ocr results",
        "timestamp": "2024-03-14T16:47:24.710Z",
        "file_path": "/tmp/spp/data/macbook_pro_2024-10-29_23-25-31.mp4",
        "offset_index": 0,
        "app_name": "Desktop",
        "window_name": "Screenshot",
        "tags": [
          "screenshot",
          "desktop"
        ],
        "frame": null
      }
    },
    {
      "type": "OCR",
      "content": {
        "frame_id": 2,
        "text": "test add frames with ocr results 2",
        "timestamp": "2024-03-14T16:47:22.624Z",
        "file_path": "/tmp/spp/data/macbook_pro_2024-10-29_23-25-31.mp4",
        "offset_index": 1,
        "app_name": "Desktop",
        "window_name": "Screenshot",
        "tags": [
          "screenshot",
          "desktop"
        ],
        "frame": null
      }
    }
  ],
  "pagination": {
    "limit": 1000,
    "offset": 0,
    "total": 2
  }
}
@cparish312 should i merge now?
@louis030195 Did some final cleanups, should be good to go!
/approve
thx!
one use case i'd want to try (would need to add an OCR option) is to create an apple shortcut to add a document into screenpipe, maybe a pdf converted to an image
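(a rough sketch of what that flow could look like from the shell; everything here is hypothetical: the device_name, timestamp, and paths are made up for illustration, the mentioned OCR option doesn't exist yet, and macOS's sips only rasterizes the first page of a pdf:)

```bash
# hypothetical flow: rasterize a pdf page to png, then push it into screenpipe
# (for multi-page pdfs, something like pdftoppm would be needed instead of sips)
sips -s format png document.pdf --out /tmp/document.png

curl -X POST "http://localhost:3030/add" -H "Content-Type: application/json" -d '{
  "device_name": "apple_shortcut",
  "content": {
    "content_type": "frames",
    "data": [
      {
        "file_path": "/tmp/document.png",
        "timestamp": "2024-10-30T00:00:00Z",
        "app_name": "Shortcuts",
        "window_name": "document.pdf",
        "ocr_results": [],
        "tags": ["pdf", "document"]
      }
    ]
  }
}' | jq
```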
@louis030195: The claim has been successfully added to reward-all. You can visit your dashboard to complete the payment.
name: Add /add endpoint to database

description
Creates an endpoint to add frames, ocr_results, and transcription results to the screenpipe database from outside sources.

related issue: #467
/claim #467

type of change

checklist

additional notes
any other relevant information about the pr.