Open douglasg14b opened 5 years ago
Thanks for taking the time to write all that, greatly appreciated!
I agree, and there is actually unmerged code that solves the issue for web-based media players through the use of the `audible` property on tabs: https://github.com/ActivityWatch/aw-webui/pull/85
I'm personally pretty happy with the above solution, as it adds minimal complexity and works for what I suspect are the most common ways people consume video on their computers today (YouTube, Netflix, other web-based players).
I'd love to give more thorough feedback on the options you mentioned, especially tagging, but I'm really busy with exams this week so it'll have to wait. In the meantime, check out the discussion in #95
This is an idea aimed towards video-playing apps, which is a big part of consuming media.
Make a separate watcher for the media. The watcher could be either just for the PC media software or for both (it would get the audible property from the web watchers).
Have a whitelist of apps and check if they're on the screen. To count time on the media watcher, you just count the time the app is on display. We have the same downside that @douglasg14b mentioned, i.e. what if the user pauses the app and leaves while the app is still on the screen?
The solution:
Check if the computer is asleep or not. On Linux, Mac and Windows alike, the computer will usually not go to sleep if there is media playing (this would work for video, not sure about purely audio). One thing to investigate further is whether the media software needs to be in full screen and playing for the computer not to go to sleep. So if there is a media app on the screen and the computer is not asleep, we would log that as playing time for the app.
Following false positives would occur:
@nicolae-stroncea
Detecting active audio alongside a list of sites/apps would bring the accuracy up to a very acceptable level in my opinion. Either of them by themselves would be too riddled with false positives to be too terribly useful. There isn't much need to go fancier than that imho. This is something I went into in the initial post.
`Active audio + afk + on youtube = watching video`. It's not perfect, but much better than `on youtube = watching video`. As an example, I have 4 monitors, and I almost always have something playing when working, and will often click on that to pause it then leave for a while, leaving myself afk with an active video player that isn't playing. Or I'll be watching Netflix, pause it and leave for a while: it's the active window, but nothing is playing.
Letting users create their own pattern matching for sites will also bring up the accuracy as it lets them add sites/applications to the list of video apps/sites.
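The rule described above (active audio combined with a user-extendable list of video sites/apps) could be sketched roughly like this. It is only an illustration: the function name, the default patterns, and the idea of matching against the window title are assumptions, not existing ActivityWatch code.

```python
import re

# Hypothetical default patterns; users could append their own.
DEFAULT_VIDEO_PATTERNS = [r"youtube", r"netflix"]


def is_watching_video(window_title, audio_active, patterns=DEFAULT_VIDEO_PATTERNS):
    """Return True only when audio is active AND the title matches a pattern."""
    if not audio_active:
        # A matching site with nothing playing (e.g. a paused video) is not counted.
        return False
    return any(re.search(p, window_title, re.IGNORECASE) for p in patterns)
```

A user-defined pattern would simply be added to the list, e.g. `is_watching_video(title, audio, DEFAULT_VIDEO_PATTERNS + [r"vimeo"])`.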
@douglasg14b
I agree that detecting audio would be the most accurate way of doing it. As you've stated, it is a complex solution. The solution I offered was meant as a less complex alternative (I think it's fair to say it would be easier to implement with the existing code, and might have a lot fewer edge cases than if we go into monitoring hardware), but at the cost of some accuracy. It ultimately depends on the amount of effort that will be put into the feature. If monitoring hardware to detect audio will take too many man-hours to implement at this time, I think the solution I suggested is a feasible proposal.
I disagree that it would be so riddled with false positives as to not be useful. `youtube is an active window && computer not on standby = watching video` would be the more accurate representation. For your example, where you would watch Netflix, pause and leave for a while: presumably the computer would go on standby, at which point the Netflix app would no longer be counted as active media time.
I believe that relying on standby is an incorrect assumption for users of this library. How many people's devices that are not laptops go into standby within a few minutes of being idle? Even plugged-in laptops default to 30-60 mins on Windows 10 in balanced mode; what about high performance mode? On desktops? You're looking at 30m - 4h IF standby is even enabled and their devices don't just turn off the screens and stay on. Never mind most Linux users, who probably don't use standby at all from what I've seen, as it's usually not on by default for most desktop installs of common flavors.
That's a lot of invalid data. If you're watching a movie and you step away to do something (bathroom, cleaning, walking the dog, cooking, making coffee, etc.), are most people actually gone long enough for their device to go into standby (60 mins)?
That would also rely VERY heavily on the end users setup, which can vary wildly, especially when assuming that their power configuration is not set as the defaults. Assumptions on user device configuration shouldn't be made unless there is data to back it up. Which is why I believe it will be more inaccurate, and potentially worse than just not recording it at all as it currently does.
Thankfully this is a FOSS project, so man-hours isn't as much of a concern as if this was an in-company product with expenses and wages to worry about. It's still relevant, but at least in my projects, I don't consider time to implement as a deciding factor for features or compatibility unless a solid and usable drop-in is available.
Has anyone investigated integrating with media players through the same mechanisms as last.fm/audio scrobblers?
Another idea would be to take a screenshot and if the screen has changed, consider it active (not afk).
> Has anyone investigated integrating with media players through the same mechanisms as last.fm/audio scrobblers?
@dynamiclover We have aw-watcher-spotify as an experiment https://github.com/activitywatch/aw-watcher-spotify
> Another idea would be to take a screenshot and if the screen has changed, consider it active (not afk).
@jtrakk Two issues with this
@johan-bjareholt
I think that actually might be a viable solution, the issues you mentioned are solvable. I wouldn't take it off the table just yet. It's also very simple, and doesn't have a lot of complexities compared to monitoring audio.
You can probably even use something like OpenCV for this, which has a lot of utilities that make this even simpler like absdiff.
Down-scale the image, which actually does two things
Perform cheap math to check the delta from one image to another
Don't monitor in real time.
Also refer to https://stackoverflow.com/questions/4196453/simple-and-fast-method-to-compare-images-for-similarity
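The "down-scale, then do cheap math on the delta" idea can be illustrated without any imaging library. The sketch below assumes each frame has already been captured and down-scaled into a list of RGB tuples (how that happens is left open), and the threshold of 2.0 is an arbitrary placeholder, not a tuned value.

```python
def mean_abs_delta(prev_pixels, curr_pixels):
    """Mean absolute per-channel difference between two equally sized frames."""
    if len(prev_pixels) != len(curr_pixels):
        raise ValueError("frames must have the same number of pixels")
    total = 0
    channels = 0
    for prev, curr in zip(prev_pixels, curr_pixels):
        for a, b in zip(prev, curr):
            total += abs(a - b)
            channels += 1
    # Two empty frames are treated as identical.
    return total / channels if channels else 0.0


def screen_changed(prev_pixels, curr_pixels, threshold=2.0):
    """True when the frames differ by more than `threshold` on average."""
    return mean_abs_delta(prev_pixels, curr_pixels) > threshold
```

Working on a heavily down-scaled frame keeps the loop cheap and also blurs away small changes such as a blinking cursor or a clock, which is presumably why down-scaling helps here.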
@douglasg14b I'd gladly help getting it to work with aw-server and the web-ui if you want to make such a watcher for ActivityWatch. We love to help anyone who wants to collect more data into ActivityWatch and make it possible for them to analyze it. You can even write the watcher in C# if that's the language you prefer; we already have one watcher written in C# that you can get some inspiration from (https://github.com/LaggAt/ActivityWatchVS).
However I don't think this is something we want to ship with activitywatch by default and definitely not have turned on by default because:
I personally don't want to spend time on this because I find there to be more important things to fix currently.
This seems doable with Pillow and pyscreenshot. Something like this might work, perhaps as a third-party watcher package.
import time

import pyscreenshot
import requests

BUCKET_URL = "http://localhost:5600/api/0/buckets/screenshot-rgb"
INTERVAL = 10

requests.post(BUCKET_URL)

while True:
    # Take a screenshot.
    im = pyscreenshot.grab(childprocess=False)
    # Get average value for each RGB channel.
    rgb = im.resize((1, 1)).getpixel((0, 0))
    # Post the rgb values.
    requests.post(BUCKET_URL + "/heartbeat", json={"rgb": list(rgb)})
    # Wait a few seconds before repeating.
    time.sleep(INTERVAL)
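One thing the sketch above glosses over is the shape of the request bodies. As far as I understand the aw-server REST API, creating a bucket expects a JSON body with `client`, `type` and `hostname`, and a heartbeat expects an event with `timestamp`, `duration` and `data` (plus a `pulsetime` query parameter). The helpers below only build those payloads; the client and type names are made up for illustration and the field layout should be checked against the server documentation.

```python
from datetime import datetime, timezone


def make_bucket_payload(hostname):
    """Body for creating the bucket (client/type names are hypothetical)."""
    return {
        "client": "aw-watcher-screenshot",  # hypothetical client name
        "type": "screenshot-rgb",  # hypothetical event type
        "hostname": hostname,
    }


def make_heartbeat_event(rgb, now=None):
    """Wrap the averaged RGB value in an event-shaped dict."""
    now = now or datetime.now(timezone.utc)
    return {
        "timestamp": now.isoformat(),
        "duration": 0,
        "data": {"rgb": list(rgb)},
    }
```

With these, the two calls would look something like `requests.post(BUCKET_URL, json=make_bucket_payload(hostname))` and `requests.post(BUCKET_URL + "/heartbeat", params={"pulsetime": INTERVAL + 1}, json=make_heartbeat_event(rgb))`.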
@jtrakk Nice start, a few suggestions:
{"afk": true/false}
I would like to mention that the power management tool `powerdevil` from KDE can detect whether there is a video playing, but I don't know how they achieve that.
What about doing it inside of the extensions?
Proposal:
const videos = document.querySelectorAll('video');
const videoPlaying = Array.from(videos).some((video) => !video.paused);
Advantages:
Disadvantages:
I believe that since majority of media is consumed online, the advantages outweigh the disadvantages.
EDIT: further problem
EDIT 2:
A potential problem here is that this would not detect content in iframes. A comparatively small (but real) amount of media is served through iframes. One example is Reuters (open an article, and it should pop up an iframe with an embedded video). Another example is embedded YouTube videos, which are also done through iframes. Maybe there are workarounds for this. The only one I found so far is checking for the 'autoplay' property, which, if set, indicates video content. However, this is not foolproof.
On further analysis, it seems the 'audible' property is indeed the better choice: given that it is the active tab and audio is playing, it should indicate the user is watching some content.
I'm currently going through my AW database reviewing all events tagged as `audible: true`, and overall, all video content is tagged correctly:
There are a couple of false positives:
Since they are purely audio, it is very likely (more often than not, I would say) that a user puts on some music/podcast/radio etc. and then works away from their computer: typing notes, cleaning, etc. So I think we could have a whitelist of these websites where we consider content as 'afk' even if they have their audible property set to true.
Found a way to do this directly with Sound Drivers using a Python library called SoundCard. This works with any type of applications, not just web browsers.
Tested successfully on both Linux (relies on PulseAudio, so should work on all distributions by default) and Windows (relies on WASAPI, works on Windows 7+).
#!/usr/bin/env python
import soundcard as sc
import numpy as np


def getMic():
    '''Get a microphone from a speaker, not the actual microphone'''
    mic = None
    mics = sc.all_microphones(include_loopback=True)
    for a_mic in mics:
        if a_mic.isloopback:
            mic = a_mic
            break
    return mic


def checkAudio(mic):
    isAudio = False
    if mic is not None:
        # record 1 second
        data = mic.record(samplerate=48000, numframes=48000)
        # any non-zero sample means something is playing
        isAudio = bool(np.any(data != 0))
    return isAudio


mic = getMic()
print(checkAudio(mic))
This will not work by default on MacOS because it does not provide loopback functionality. It would need a loopback device such as SoundFlower: customize `getMic` for MacOS, and then get the mic that has SoundFlower's name. I don't have a Mac to test this, so somebody should confirm whether this works.
@nicolae-stroncea I'll try this on my Macbook and report back shortly.
@jmealo Not sure if you already found it, but this tutorial seemed useful to me. It helps avoid some potential pitfalls of the setup, specifically that if you don't set multi-output, your Mac won't play any sound at all since all of it will be routed only to SoundFlower. It also explains how to select SoundFlower as an input device, which is what we need
I'll still test Soundflower, but, I found this: https://stackoverflow.com/questions/27604207/applescript-check-if-computer-is-playing-any-sound#27608712
When I play a YouTube video in Chrome:
pmset -g | grep coreaudiod
sleep 1 (sleep prevented by sharingd, Google Chrome, coreaudiod, useractivityd)
When I paused the video, `coreaudiod` stopped preventing sleep and no longer appeared in the output.
I fired up Zoom, with no meeting there was no output, upon starting a new meeting:
hibernatefile /var/vm/sleepimage
disksleep 0
sleep 1 (sleep prevented by sharingd, coreaudiod, coreaudiod)
displaysleep 2 (display sleep prevented by zoom.us)
As far as false positives go: assuming a browser extension, you can differentiate between listening to music/watching a video.
If you poll this at regular intervals, you don't have to worry about notifications much. It seems like video conferencing will prevent the display from sleeping. I can test with something that uses WebRTC and verify.
I'm providing the output of some `pmset` commands that should provide information helpful for time/activity tracking:
While playing a Youtube video in Chrome:
2020-06-13 14:16:26 -0400
Assertion status system-wide:
BackgroundTask 0
ApplePushServiceTask 0
UserIsActive 1
PreventUserIdleDisplaySleep 0
PreventSystemSleep 0
ExternalMedia 0
PreventUserIdleSystemSleep 1
NetworkClientActive 0
Listed by owning process:
pid 434(sharingd): [0x0000377400018c33] 00:00:40 PreventUserIdleSystemSleep named: "Handoff"
pid 626(Google Chrome): [0x000036c100018c27] 00:03:39 NoIdleSleepAssertion named: "Playing audio"
pid 273(mds_stores): [0x0000379c000b8c46] 00:00:00 BackgroundTask named: "com.apple.metadata.mds_stores.power"
pid 198(coreaudiod): [0x0000366f000180cb] 00:05:01 PreventUserIdleSystemSleep named: "com.apple.audio.AppleHDAEngineOutput:1B,0,1,1:0.context.preventuseridlesleep"
Created for PID: 742.
pid 431(useractivityd): [0x0000379a00018c45] 00:00:01 PreventUserIdleSystemSleep named: "BTLEAdvertisement"
Timeout will fire in 58 secs Action=TimeoutActionTurnOff
pid 151(hidd): [0x0000365400098c0a] 00:00:00 UserIsActive named: "com.apple.iohideventsystem.queue.tickle serviceID:100000363 name:AppleEmbeddedKeyboa product:Apple Internal Keyb eventType:3"
Timeout will fire in 120 secs Action=TimeoutActionRelease
No kernel assertions.
Idle sleep preventers: IODisplayWrangler
While in a Zoom meeting (it looks like the developers forgot to provide the correct value for the activity):
% pmset -g assertions
2020-06-13 14:18:49 -0400
Assertion status system-wide:
BackgroundTask 0
ApplePushServiceTask 0
UserIsActive 1
PreventUserIdleDisplaySleep 1
PreventSystemSleep 0
ExternalMedia 0
InternalPreventDisplaySleep 1
PreventUserIdleSystemSleep 1
NetworkClientActive 0
Listed by owning process:
pid 26724(zoom.us): [0x0000381e00058c76] 00:00:12 NoDisplaySleepAssertion named: "Describe Activity Type"
pid 434(sharingd): [0x0000377400018c33] 00:03:02 PreventUserIdleSystemSleep named: "Handoff"
pid 106(powerd): [0x0000381600108002] 00:00:20 InternalPreventDisplaySleep named: "com.apple.powermanagement.delayDisplayOff"
Timeout will fire in 100 secs Action=TimeoutActionTurnOff
pid 431(useractivityd): [0x0000382700018c78] 00:00:03 PreventUserIdleSystemSleep named: "BTLEAdvertisement"
Timeout will fire in 56 secs Action=TimeoutActionTurnOff
pid 384(nsurlsessiond): [0x0000382800018c7a] 00:00:02 PreventUserIdleSystemSleep named: "NSURLSessionTask ADC0E368-B668-4A09-B48C-B1B11C78F152"
Timeout will fire in 10798 secs Action=TimeoutActionTurnOff
pid 384(nsurlsessiond): [0x0000382800018c7b] 00:00:02 PreventUserIdleSystemSleep named: "NSURLSessionTask B2ED8888-9B0E-4A54-9F6F-207CFA4B82A2"
Timeout will fire in 10798 secs Action=TimeoutActionTurnOff
pid 198(coreaudiod): [0x0000381f00018c5c] 00:00:11 PreventUserIdleSystemSleep named: "com.apple.audio.AppleHDAEngineOutput:1B,0,1,1:0.context.preventuseridlesleep"
Created for PID: 26724.
pid 198(coreaudiod): [0x0000381e00018c58] 00:00:12 PreventUserIdleSystemSleep named: "com.apple.audio.AppleHDAEngineInput:1B,0,1,0:1.context.preventuseridlesleep"
Created for PID: 26724.
pid 151(hidd): [0x0000365400098c0a] 00:00:00 UserIsActive named: "com.apple.iohideventsystem.queue.tickle serviceID:100000363 name:AppleEmbeddedKeyboa product:Apple Internal Keyb eventType:3"
Timeout will fire in 120 secs Action=TimeoutActionRelease
No kernel assertions.
Idle sleep preventers: IODisplayWrangler
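For a watcher, the per-process lines in the listings above are probably the most useful part. A rough sketch of pulling them apart follows; the regular expression is derived only from the two samples shown here, so the line format and the "Playing audio" assertion name (which appears to be chosen by Chrome) are assumptions that may not hold for other macOS versions or applications.

```python
import re

# Matches lines such as:
#   pid 626(Google Chrome): [0x000036c100018c27] 00:03:39 NoIdleSleepAssertion named: "Playing audio"
ASSERTION_RE = re.compile(
    r'^\s*pid (\d+)\((.+?)\): \[0x[0-9a-fA-F]+\] '
    r'(\d{2}:\d{2}:\d{2}) (\S+) named: "(.*)"'
)


def parse_assertions(output):
    """Return one dict per 'pid N(process): ...' line of the assertions listing."""
    found = []
    for line in output.splitlines():
        match = ASSERTION_RE.match(line)
        if match:
            pid, process, age, kind, name = match.groups()
            found.append(
                {"pid": int(pid), "process": process, "age": age, "type": kind, "name": name}
            )
    return found


def processes_playing_audio(output):
    """Processes holding an assertion named "Playing audio" (name seen in the Chrome sample)."""
    return sorted({a["process"] for a in parse_assertions(output) if a["name"] == "Playing audio"})
```

Lines that do not match (headers, "Created for PID", timeouts) are simply skipped.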
Also found this command, which seems to draw inspiration from the same source:
if [[ "$(pmset -g | grep ' sleep')" == *"coreaudiod"* ]]; then echo audio is playing; else echo no audio playing; fi
It doesn't have the same level of detail, but can give a quick, cheap check if audio is playing
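The same quick check could be done from Python on the text of `pmset -g`. This is a sketch based only on the sample output earlier in the thread: it assumes the relevant line starts with the word `sleep` and lists the processes preventing sleep, which may differ between macOS versions.

```python
def audio_playing_from_pmset(pmset_output):
    """True if coreaudiod is listed on the 'sleep' line of `pmset -g` output."""
    for line in pmset_output.splitlines():
        fields = line.split()
        # Only the plain 'sleep' setting counts, not 'disksleep' or 'displaysleep'.
        if fields and fields[0] == "sleep":
            return "coreaudiod" in line
    return False
```

On a Mac the input would come from something like `subprocess.run(["pmset", "-g"], capture_output=True, text=True).stdout`.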
@nicolae-stroncea: you can do all sorts of activity tracking beyond what you set out to do on OSX with `pmset -g assertions`: you can see whether the user clicks, scrolls, touches, multi-touches, types, etc. (it logs whatever resets the idle user timeout, as well as a countdown, so you can infer a great deal from this). Additionally, we get verbose logging of what's keeping the system from sleeping, which includes playing audio/video or using the webcam.
I wasn't able to get your Python to run, is it Python 2? I think it's a dead-end (but good idea! especially without having access to the hardware) given what I'm able to do by tailing the power telemetry from OSX.
Using `pmset` is low-overhead, can run as an unprivileged user, and doesn't require a third-party kernel extension, so it seems like the right way to approach what you set out to do (and then some!). It honestly seems like a bit of an oversight from a privacy perspective shrug.
@jmealo that's pretty neat! I imagine there's a lot of nice aw-watcher possibilities lying in there.
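The script is Python3, but it would need some customizing for Mac to get it working with `soundcard`:
- Once you install SoundFlower, you would need to query all of the microphones and find the name that MacOS uses for SoundFlower: `sc.all_microphones()`. Iterate through them, then get the name of each microphone by doing `the_mic.name`, to find what name SoundFlower goes by.
- Once you get the name by looking through the mics, you can get the microphone by the name: `mic = sc.get_speaker('name_of_soundflower_input')`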
I agree that since `pmset ...` is lower overhead, it would be preferred.
I looked for a similar command that could be useful on Linux, and found: `pacmd list-sink-inputs` (again dependent on PulseAudio, and I don't think there is a lot of fragmentation on this front). You can find if any sound is running by doing: `pacmd list-sink-inputs | grep -w state | grep RUNNING`. `pacmd list-sink-inputs` also returns info on the application running, which is useful:
index: 173
driver: <protocol-native.c>
flags: START_CORKED
state: RUNNING
sink: 1 <alsa_output.pci-0000_00_1f.3.analog-stereo>
volume: front-left: 52016 / 79% / -6.02 dB, front-right: 52016 / 79% / -6.02 dB
balance 0.00
muted: no
current latency: 89.25 ms
requested latency: 75.01 ms
sample spec: float32le 2ch 44100Hz
channel map: front-left,front-right
Stereo
resample method: copy
module: 10
client: 17 <Firefox>
properties:
media.name = "AudioStream"
application.name = "Firefox"
native-protocol.peer = "UNIX socket client"
native-protocol.version = "33"
application.process.id = "5675"
application.process.user = "nicolae"
application.process.host = "nicolae"
application.process.binary = "firefox"
application.language = "en_US.UTF-8"
window.x11.display = ":0"
application.icon_name = "firefox"
module-stream-restore.id = "sink-input-by-application-name:Firefox"
There are a couple of weird quirks that I didn't figure out about this. If I mute an application (but allow it to run), it will still show up with `state: RUNNING` and `muted: no`, so not sure why this happens.
Wasn't able to find any similar command for Windows that we would be able to trigger directly from Python, but I'm not too familiar with developing on the platform. Worst case, soundcard could still be used for the cases where a reliable low-overhead platform-dependent command is not found.
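Since the `pacmd list-sink-inputs` listing names the application behind each stream, a watcher could report which applications are playing rather than just a yes/no. A small sketch, assuming the text format shown above (one `index:` line per sink input, followed by `state:` and `application.name` lines):

```python
import re


def running_sink_inputs(pacmd_output):
    """Return application names of sink inputs whose state is RUNNING."""
    apps = []
    state = None
    for raw in pacmd_output.splitlines():
        line = raw.strip()
        if line.startswith("index:"):
            state = None  # a new sink input begins
        elif line.startswith("state:"):
            state = line.split(":", 1)[1].strip()
        else:
            match = re.match(r'application\.name = "(.*)"', line)
            if match and state == "RUNNING":
                apps.append(match.group(1))
    return apps
```

Given the quirk noted above, a muted application would presumably still be reported as running by this approach.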
> @jmealo that's pretty neat! I imagine there's a lot of nice aw-watcher possibilities lying in there.
> The script is Python3, but it would need some customizing for Mac to get it working with `soundcard`:
> - Once you install SoundFlower, you would need to query all of the microphones, and find the name that MacOS uses for SoundFlower: `sc.all_microphones()`. Iterate through them, then get the name of each microphone by doing `the_mic.name`, to find what name SoundFlower goes by.
> - Once you get the name by looking through the mics, you can get the microphone by the name: `mic = sc.get_speaker('name_of_soundflower_input')`
For what it's worth: `Soundflower (2ch)` or `Soundflower (64ch)` seem to be the device names.
> I agree that since `pmset ...` is lower overhead, it would be preferred.
> I looked for a similar command that could be useful on Linux, and found: `pacmd list-sink-inputs` (again dependent on PulseAudio, and I don't think there is a lot of fragmentation on this front). You can find if any sound is running by doing: `pacmd list-sink-inputs | grep -w state | grep RUNNING`.
> There are a couple of weird quirks that I didn't figure out about this. If I mute an application (but allow it to run), it will still show up with `state: RUNNING` and `muted: no`, so not sure why this happens.
What a great find! I was looking to see if `systemd` had something, but if PulseAudio can be queried directly, that'd be good. I can't imagine that there's not a similar solution on any *nix-based OS.
> Wasn't able to find any similar command for Windows that we would be able to trigger directly from Python, but I'm not too familiar with developing on the platform. Worst case, soundcard could still be used for the cases where a reliable low-overhead platform-dependent command is not found.
It looks like `powercfg` is what we're looking for on Windows. I'm hoping that read-only operations don't require elevated permissions; write ones certainly do.
The output of `powercfg -requests` looks something like this (found on Microsoft Answers for troubleshooting sleep issues):
SYSTEM:
[DRIVER] Cirrus Logic High Definition Audio (HDAUDIO\FUNC_01&VEN_ ...)
An audio stream is currently in use.
[PROCESS] \Device\HarddiskVolume2\Program Files (x86)\Windows Media Player\wmplayer.exe
Here's the documentation that I found for the command so far: https://docs.microsoft.com/en-us/windows-hardware/design/device-experiences/powercfg-command-line-options
I just tested on Windows:
No video/audio playing:
Microsoft Windows [Version 10.0.18363.900]
(c) 2019 Microsoft Corporation. All rights reserved.
C:\Windows\system32>powercfg -requests
DISPLAY:
None.
SYSTEM:
None.
AWAYMODE:
None.
EXECUTION:
None.
PERFBOOST:
[DRIVER] Legacy Kernel Caller
Power Manager
ACTIVELOCKSCREEN:
None.
Video playing:
C:\Windows\system32>powercfg -requests
DISPLAY:
[PROCESS] \Device\HarddiskVolume7\Program Files (x86)\Google\Chrome\Application\chrome.exe
Video Wake Lock
SYSTEM:
[DRIVER] NVIDIA High Definition Audio (HDAUDIO\FUNC_01&VEN_10DE&DEV_0072&SUBSYS_38423967&REV_1001\5&34bd84db&0&0001)
An audio stream is currently in use.
AWAYMODE:
None.
EXECUTION:
[PROCESS] \Device\HarddiskVolume7\Program Files (x86)\Google\Chrome\Application\chrome.exe
Playing audio
PERFBOOST:
None.
ACTIVELOCKSCREEN:
None.
Audio playing:
C:\Windows\system32>powercfg -requests
DISPLAY:
None.
SYSTEM:
[DRIVER] NVIDIA High Definition Audio (HDAUDIO\FUNC_01&VEN_10DE&DEV_0072&SUBSYS_38423967&REV_1001\5&34bd84db&0&0001)
An audio stream is currently in use.
AWAYMODE:
None.
EXECUTION:
[PROCESS] \Device\HarddiskVolume7\Program Files (x86)\Google\Chrome\Application\chrome.exe
Playing audio
PERFBOOST:
None.
ACTIVELOCKSCREEN:
None.
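Turning those listings into something a watcher can act on mostly means splitting the text by section. The sketch below assumes the English-language layout shown above (upper-case section names ending in a colon, `None.` for empty sections); localized Windows installs would likely need different strings, and it only parses text that has already been captured.

```python
def parse_powercfg_requests(output):
    """Map each section (DISPLAY, SYSTEM, ...) to its non-empty request lines."""
    sections = {}
    current = None
    for raw in output.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.endswith(":") and " " not in line and line[:-1].isupper():
            current = line[:-1]  # e.g. "DISPLAY"
            sections[current] = []
        elif current is not None and line != "None.":
            sections[current].append(line)
    return sections


def audio_stream_in_use(output):
    """True if the SYSTEM section reports an active audio stream."""
    system = parse_powercfg_requests(output).get("SYSTEM", [])
    return any("audio stream is currently in use" in entry.lower() for entry in system)
```

A non-empty `DISPLAY` section (the "Video Wake Lock" entry in the video sample) could then be used to tell video apart from audio-only playback.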
@jmealo nice find! I just tested it, and it works well. Unfortunately, it requires administrative privileges. I had to run powershell as an administrator to get it to work.
Here's the output I got when playing a youtube video for it:
DISPLAY:
[PROCESS] \Device\HarddiskVolume3\Program Files\Mozilla Firefox\firefox.exe
[PROCESS] \Device\HarddiskVolume3\Program Files\Mozilla Firefox\firefox.exe
[PROCESS] \Device\HarddiskVolume3\Program Files\Mozilla Firefox\firefox.exe
[PROCESS] \Device\HarddiskVolume3\Program Files\Mozilla Firefox\firefox.exe
[PROCESS] \Device\HarddiskVolume3\Program Files\Mozilla Firefox\firefox.exe
[PROCESS] \Device\HarddiskVolume3\Program Files\Mozilla Firefox\firefox.exe
[PROCESS] \Device\HarddiskVolume3\Program Files\Mozilla Firefox\firefox.exe
[PROCESS] \Device\HarddiskVolume3\Program Files\Mozilla Firefox\firefox.exe
[PROCESS] \Device\HarddiskVolume3\Program Files\Mozilla Firefox\firefox.exe
[PROCESS] \Device\HarddiskVolume3\Program Files\Mozilla Firefox\firefox.exe
[PROCESS] \Device\HarddiskVolume3\Program Files\Mozilla Firefox\firefox.exe
SYSTEM:
[DRIVER] Realtek Audio (INTELAUDIO\FUNC_01&VEN_10EC&DEV_0298&SUBSYS_1028087C&REV_1001\4&2223f159&2&0001)
An audio stream is currently in use.
AWAYMODE:
None.
EXECUTION:
None.
PERFBOOST:
None.
ACTIVELOCKSCREEN:
None.
Were you able to somehow do it without privileged access?
> I had to run powershell as an administrator to get it to work.
:( Same here, I used an elevated `cmd`. I wonder what it uses under the hood, and whether there's an alternative way to get log entries for this from Windows that doesn't require elevated permissions.
I was able to use this code on Windows to detect sound. Runs without any privileges.
This seems to be the meat of the code:
IMMDeviceEnumerator enumerator = (IMMDeviceEnumerator)(new MMDeviceEnumerator());
IMMDevice speakers = enumerator.GetDefaultAudioEndpoint(EDataFlow.eRender, ERole.eMultimedia);
IAudioMeterInformation meter = (IAudioMeterInformation)speakers.Activate(typeof(IAudioMeterInformation).GUID, 0, IntPtr.Zero);
float value = meter.GetPeakValue();
This seems to just be instantiating a couple of objects and then getting a peak sample value for the audio stream.
`Measure-Command {[Foo.Bar]::IsWindowsPlayingSound()}` returns this:
Days : 0
Hours : 0
Minutes : 0
Seconds : 0
Milliseconds : 491
Ticks : 4910539
TotalDays : 5.68349421296296E-06
TotalHours : 0.000136403861111111
TotalMinutes : 0.00818423166666667
TotalSeconds : 0.4910539
TotalMilliseconds : 491.0539
EDIT: This is on a i7-8750H CPU
The accepted Stack Overflow answer uses `Add-Type`, which seems to compile the C# on the fly so it can be called from PowerShell. So we would likely be able to run this command from a Python script via PowerShell. Somebody else also made a C++ version of the same check (https://github.com/smourier/IsWindowsPlayingSound), so a second option would be to just execute the C++ program from Python. I think a good approach might be to try out all 3 options (SoundCard vs C# with PowerShell vs C++ program) and see which one is lightest on resources; my bet is on C++.
@nicolae-stroncea: It looks like UAC eliminated any way for non-administrators to use `powercfg` (the registry entries that used to work in XP don't appear to have any effect on Windows 10), so if you went that route you'd need to request UAC permissions (or run as a service?). Just for fun, you could try this: https://github.com/rootm0s/WinPwnage
So I put together a quick script which uses platform-specific commands to check for audio. At the moment you can run it with `python media_watcher.py` and it will output True/False, depending on whether anything is playing:
https://github.com/nicolae-stroncea/aw_audio_detector/blob/master/media_watcher.py
For Windows:
For Linux:
audible property.
For Mac:
osascript -e 'output muted of (get volume settings)' from Python. Relevant Link
I may have missed something, so any testing of these would be appreciated.
While this doesn't detect video, it can detect audio within and outside the browser
@nicolae-stroncea: Should the Windows path use `%SystemRoot%\system32` / `%WINDIR%\system32` or something similar to resolve where system32 is?
I tested on OSX. The script returns `true` if the system is muted (I used the mute keyboard key to make sure it was a system-level mute, rather than volume: 0). It also returns `true` if I mute a YouTube video in Chrome. `osascript -e 'output muted of (get volume settings)'` returns `true` / `false` as expected for muted system-level audio.
I did some preliminary digging and found this: https://github.com/kyleneideck/BackgroundMusic/commit/944fc112128e2e9513fea73473c347b1e5bd64f0 (this is an example where support was added for Facetime, suggesting that workarounds are required per-application) There's also source files for various media players. It doesn't appear there's an easy way to enumerate the application-level volumes of all running applications at the OS-level yet.
As a user I'd personally prefer shipping something like
`(youtube|netflix|hulu|etc) is active && no keyboard/mouse movement for up to 1h30m -> notAFK`
Would shipping that before the "is audio playing" check be viable? I think shipping the "is audio playing" check is an excellent idea, but this issue has been open since Jan 6th 2019.
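For concreteness, the rule proposed above could look roughly like this. It is a sketch only: the function name and the use of the active window title are assumptions, while the pattern and the 1h30m cap are taken from the comment.

```python
import re

MEDIA_PATTERN = re.compile(r"youtube|netflix|hulu", re.IGNORECASE)
MAX_IDLE_SECONDS = 90 * 60  # the 1h30m cap from the proposal


def media_overrides_afk(active_title, idle_seconds):
    """True if idle time should still count as not-AFK under the proposed rule."""
    if idle_seconds < 0:
        raise ValueError("idle_seconds must be non-negative")
    on_media_site = MEDIA_PATTERN.search(active_title) is not None
    return on_media_site and idle_seconds <= MAX_IDLE_SECONDS
```

When the function returns False, the regular AFK logic would apply unchanged; the cap bounds how much time a paused-and-abandoned video can be mislabeled.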
I'm sure I'm not the first to come up with this, but what about doing away with complex heuristics and just having a switch "Consider time spent in fullscreen applications as not-afk"?
Assuming `Full screen -> active` is a heuristic too, and I can see many ways for it to fail (you watch a movie/play a game, but pause/leave the device for a while to take a break).
I'm not even sure if it's feasible to detect if an application is full screen in a reliable and cross-platform manner.
Yup, but it's a simple one, and as I've said above, I can't really think of an instance where it would fail for me (or at least, fail worse than what we've got now), and if it does, I think I would be fine with the result.
> you watch a movie/play a game, but pause/leave the device for a while to take a break
That would result in a couple minutes of wrongly tagged time, as opposed to literal hours.
Additionally, pausing an online game does vaguely match what I would consider "not-AFK" time: a fullscreen app running directly implies that attention is meant to be paid to the computer. I understand this may be a point of contention, but that's why I'm appealing to the simplicity of the heuristic. It's easy to read, since it doesn't rely on complex OS machinery relating to audio, or combinations of window titles and somewhat arbitrary (1hr30) idleness thresholds, etc.
If there's an artifact, it's likely going to be small(er) and easily recognized. And if it's an issue, well, it's a single switch after all. It can even be false by default.
> I'm not even sure if it's feasible to detect if an application is fullscreen in a reliable and cross-platform manner.
Well, this thread seems to be considering metrics which so far do not seem feasible to detect even on any single operating system! :) So I didn't think this would be that much of a hurdle.
The bottom line is: This is a somewhat major issue for an activity tracking software, and has been unsolved for more than 2 years. I do appreciate its difficulty and complexity, and I don't claim that fullscreen-tracking is the final solution, but! I think it's a feasible partial solution, and if it were present, I would have simply toggled it on, and went on with my day. "Don't let perfect be the enemy of the good", yadda yadda.
> but it's a simple one
I don't think it's any simpler than most of the other things already suggested (but it's a good addition still!).
> That would result in a couple minutes of wrongly tagged time, as opposed to literal hours.
Everyone's usage is different. I often leave my computer with a game running for hours. I also sometimes fall asleep to a video playing. Personally, I'm not inclined to implement & maintain a feature I won't have any use of myself.
> Well, this thread seems to be considering metrics which so far do not seem feasible to detect even on any single operating system! :) So I didn't think this would be that much of a hurdle.
It does! But those hurdles are exactly why it hasn't been implemented. Although I disagree that the other proposed solutions aren't feasible to detect on a single OS, and I think it's about as feasible to check if audio is playing vs if a window is fullscreen (but hard to speculate without seeing example code for the latter).
The bottom line is: This is a somewhat major issue for activity-tracking software, and has been unsolved for more than 2 years. I do appreciate its difficulty and complexity, and I don't claim that fullscreen-tracking is the final solution, but! I think it's a feasible partial solution, and if it were present, I would have simply toggled it on and gone on with my day. "Don't let perfect be the enemy of the good", yadda yadda.
100%. However, the 'good' solution that's the most likely to get implemented anytime soon is using the audible attribute reported by aw-watcher-web (https://github.com/ActivityWatch/aw-webui/pull/85). It will only work when you use your browser for watching videos (which happens to be the case for me most of the time) and is obviously not perfect, but "Don't let perfect be the enemy of the good", yadda yadda ;)
I just merged https://github.com/ActivityWatch/aw-webui/pull/262 which implements the "audible-as-active" feature. It makes it so that if your browser is the active app, and the active browser tab is audible (playing sound), then it will not count that time as AFK (and therefore make it show on your Activity view).
It requires that you're running the web watcher for your browser.
This vastly improves the situation when you watch a video in your web browser, but it is not a complete solution, so I'll leave the issue open for now.
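To make the rule concrete, here is a minimal sketch of the "audible-as-active" decision as described above. It is not the code from the merged PR; the `Snapshot` type, the `BROWSER_APPS` list and the `counts_as_active` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """One moment in time, combining what the three watchers report."""
    afk: bool          # AFK watcher: no keyboard/mouse input recently
    active_app: str    # window watcher: name of the focused application
    tab_audible: bool  # web watcher: the active browser tab is playing sound


# Hypothetical list of browser names; a real list would differ per OS.
BROWSER_APPS = {"firefox", "chrome", "chromium"}


def counts_as_active(s: Snapshot, audible_as_active: bool = True) -> bool:
    """Return True if this moment should show up as non-AFK time."""
    if not s.afk:
        return True  # real input was seen, nothing to override
    if not audible_as_active:
        return False  # feature switched off in the settings
    # Override AFK only when the browser is focused AND its active tab is audible.
    return s.active_app.lower() in BROWSER_APPS and s.tab_audible
```

Under these assumptions, `counts_as_active(Snapshot(afk=True, active_app="Firefox", tab_audible=True))` is true, while the same snapshot with VLC focused stays AFK, which is why this is only a partial solution.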
Will this be available in the nightly build soon? It seems that the aw-webui module is still pinned to the commit from last December.
@archiif I just updated aw-server and aw-webui in the main activitywatch repo, so it should hopefully work. The recent aw-webui change, however, does not work on aw-server-rust yet.
@archiif I have fixed the integration tests now too, so the nightly builds are working again.
Is that optional? I use the Chrome plugin, and I have music (a YouTube video, though) running pretty much all the time, no matter if I'm in front of it or not.
@luckydonald Yes it's optional, you can turn it off in the settings.
Lots of programs use MPRIS for sending playback info to the system (at least on Linux, I'm not sure about other platforms) so that it can be shown in various places in the system. Just throwing it out there as a possible way to cheaply and universally check if media is playing. Example from GNOME (media playing in Firefox): [screenshot not included]
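As a rough illustration of the MPRIS idea (Linux only), the sketch below asks every MPRIS player on the D-Bus session bus for its `PlaybackStatus` property. The bus name prefix, object path and interface names come from the MPRIS specification; the function names are made up here, and the query part assumes the third-party dbus-python package is installed. Nothing like this exists in ActivityWatch as far as this thread shows.

```python
MPRIS_PREFIX = "org.mpris.MediaPlayer2."
MPRIS_PATH = "/org/mpris/MediaPlayer2"
PLAYER_IFACE = "org.mpris.MediaPlayer2.Player"
PROPS_IFACE = "org.freedesktop.DBus.Properties"


def any_playing(statuses):
    """True if at least one player reports PlaybackStatus == 'Playing'."""
    return any(status == "Playing" for status in statuses.values())


def query_mpris_statuses():
    """Return {bus_name: PlaybackStatus} for every MPRIS player found.

    Needs dbus-python and a running session bus, so it is imported lazily
    and only used when this function is actually called.
    """
    import dbus

    bus = dbus.SessionBus()
    statuses = {}
    for name in bus.list_names():
        if not name.startswith(MPRIS_PREFIX):
            continue  # not a media player
        props = dbus.Interface(bus.get_object(name, MPRIS_PATH), PROPS_IFACE)
        statuses[str(name)] = str(props.Get(PLAYER_IFACE, "PlaybackStatus"))
    return statuses
```

A watcher could poll `any_playing(query_mpris_statuses())` every few seconds and treat a true result as "media is playing". Players that do not implement MPRIS would simply not appear.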
What about Zoom or other conference call apps? I find Zoom being marked as AFK most of the time, even though that time actually counts towards my productivity hours 😞
I have the same issue - very happy to contribute if someone can help shape the solution. What about having an exclusion regexp for the afk watcher?
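One way the exclusion regexp could look, as a sketch only: the pattern, the setting name `AFK_EXCLUDE_PATTERN` and the function `effective_status` are hypothetical, and the "afk" / "not-afk" strings are assumed to mirror the status values the AFK watcher reports.

```python
import re

# Hypothetical user setting: apps whose AFK time should still count as active.
AFK_EXCLUDE_PATTERN = re.compile(r"zoom|teams|skype", re.IGNORECASE)


def effective_status(afk_status: str, active_app: str,
                     pattern: re.Pattern = AFK_EXCLUDE_PATTERN) -> str:
    """Return 'not-afk' when the focused app matches the exclusion pattern."""
    if afk_status == "afk" and pattern.search(active_app):
        return "not-afk"  # e.g. sitting in a call without touching the keyboard
    return afk_status
```

With this rule, time spent with Zoom, Teams or Skype focused would never be marked AFK, which also means leaving a call window open while away would be counted as active.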
Just wanted to say I had the same issue with conferencing apps like Zoom, Teams or Skype. Is there any solution to that? Thank you.
The Problem
One of the larger time-sinks today is video, be that through streaming services like Netflix, Amazon Prime Video, or HBO; media sites like YouTube, Twitch, or Vimeo; or downloaded or streamed media on players like VLC or MPV.
ActivityWatch, unfortunately, fails to effectively record the time spent on these activities, as it relies on mouse and keyboard input to determine activity. This means that when watching a video, ActivityWatch will mark the time as afk after a short while, even though the user is present and spending time on an activity at their device.
This was brought up in https://github.com/ActivityWatch/activitywatch/issues/186, which was marked as wontfix. I believe that this is something that CAN be solved, and that it should be seriously considered given the amount of time that can be spent consuming media.
Afk time can be disabled, but this then pollutes the data. Users may go afk for a variety of times in a variety of applications or websites throughout the day, which could pollute the afk time to the point that video-specific time is no longer useful.
Possible solutions
Note: Not all problems/disadvantages/pitfalls are meant to be solvable. I am including them for devil's advocate's sake and to foster a more robust discussion.
Application/Site Tagging
Compile a list of common media applications and websites. When the user goes afk on one of these sites or applications, mark the time as non-afk. This isn't technically tagging, but it could be set up to work in a tag-like way, which would make this a very extendable solution for more than just videos.
Advantages
Disadvantages/Pitfalls
Enhancements
These are here to try and solve for some of the problems presented under disadvantages.
Monitoring Hardware
Monitor audio output to see if a video is playing.
Advantages
Disadvantages/Pitfalls
Enhancements
General Enhancements
These are enhancements that could apply to any solution, to increase accuracy and to enable the user to correct mismatches and errors.
User-Defined Lists & Filters
Let the user create/modify the list of sites/applications, and/or the patterns used to match them.
Tagging and Pattern Matching
This goes above and beyond, but would really turn this into a much more powerful tool.
Instead of just solving the video problem, create an extendable solution that encompasses the general problem category that the video problem is part of. This would take the form of tagging: being able to automatically tag domains & applications with predefined or user-defined tags. This can be facilitated with pattern-matching lists and, depending on the data's schema/format, could be applied to already existing data, greatly enhancing its utility.
As an example, time in VLC, YouTube, or Netflix could be tagged as video, which gives users the power to filter this time separately, combine it in reports, or more easily correct collection errors.
This of course could be set up to be user-manageable, using the points listed in the previous section.
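A small sketch of what pattern-based tagging might look like, assuming events are plain dictionaries with an app name, an optional URL and a duration in seconds. The rule table, field names and functions are hypothetical and not an existing ActivityWatch API.

```python
import re
from collections import defaultdict

# Hypothetical rules: tag name -> pattern matched against app name and URL.
TAG_RULES = {
    "video": re.compile(r"vlc|mpv|youtube\.com|netflix\.com", re.IGNORECASE),
}


def tags_for(event):
    """Return the set of tags whose pattern matches this event."""
    text = f"{event.get('app', '')} {event.get('url', '')}"
    return {tag for tag, pattern in TAG_RULES.items() if pattern.search(text)}


def time_per_tag(events):
    """Sum event durations (seconds) for each matching tag."""
    totals = defaultdict(float)
    for event in events:
        for tag in tags_for(event):
            totals[tag] += event["duration"]
    return dict(totals)


events = [
    {"app": "vlc", "duration": 1200.0},
    {"app": "firefox", "url": "https://www.youtube.com/watch?v=abc", "duration": 600.0},
    {"app": "code", "duration": 900.0},
]
```

Because the rules are applied when the data is read rather than when it is recorded, the same approach could be run over already collected events, and users could edit `TAG_RULES` to add their own tags.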
Conclusion
I believe the ability to capture time spent consuming video-based media will have an impact on the future usability of this project as these sorts of services continue to expand and bring in more and more people. Solving it would not only close this specific gap, but could also greatly enhance the utility and power of this application.
What are your thoughts? (Please don't auto-mark this as closed; it took some time and effort to create.)