con / duct

A helper to run a command, capture stdout/stderr and details about running
https://pypi.org/project/con-duct/
MIT License

Change script to open usage file once and close upon completion (keep it open through execution) #208

Closed yarikoptic closed 1 month ago

yarikoptic commented 1 month ago

TODO: refactor (RF) to keep the usage file open and close it upon completion.

Also, why do we append to the info file -- shouldn't it be opened with just `w` as well?

TL;DR: full story on how I ran into it: on reproiner we run ffmpeg under `duct`. `git annex assistant` monitors / commits / pushes changes. I just got

```
yoh@typhon:/data/repronim/reprostim-reproiner$ git annex sync
(merging origin/git-annex reproiner/git-annex into git-annex...)
git-annex sync will change default behavior in the future to send content to repositories that have preferred content configured. If you do not want this to send any content, use --no-content (or -g) to prepare for that change. (Or you can configure annex.synccontent)
commit
On branch master
Your branch is behind 'reproiner/master' by 1 commit, and can be fast-forwarded.
  (use "git pull" to update your local branch)
nothing to commit, working tree clean
ok
pull reproiner
Updating 009f7f494..c44b4e6a4
Fast-forward
 Videos/2024/10/2024.10.18-15.55.41.103--.mkv.duct_usage.json | 1 +
 1 file changed, 1 insertion(+)
 create mode 100644 Videos/2024/10/2024.10.18-15.55.41.103--.mkv.duct_usage.json
git-annex: The file Videos/2024/10/2024.10.18-15.55.41.103--.mkv.duct_usage.json looks like git-annex pointer file that has had other content appended to it
Already up to date.
ok
pull origin
ok
```

so we got a file `Videos/2024/10/2024.10.18-15.55.41.103--.mkv.duct_usage.json` which was annexed but also got another record written to it:

```shell
yoh@typhon:/data/repronim/reprostim-reproiner$ cat Videos/2024/10/2024.10.18-15.55.41.103--.mkv.duct_usage.json
/annex/objects/MD5E-s881--eb285431d379280cb90b4878aed1506e.json
{"timestamp": "2024-10-18T16:33:46.586299-04:00", "num_samples": 6, "processes": {"270229": {"pcpu": 152.0, "pmem": 0.7, "rss": 242880512, "vsz": 791842816, "timestamp": "2024-10-18T16:33:46.586299-04:00", "etime": "38:05", "cmd": "RLsl ffmpeg -f alsa -ac 2 -thread_queue_size 4096 -i hw:2,1 -f v4l2 -input_format yuyv422 -framerate 60 -video_size 1920x1080 -thread_queue_size 4096 -i /dev/video2 -c:v libx264 -flush_packets 1 -preset ultrafast -crf 18 -tune zerolatency -b:v 8M -maxrate 8M -bufsize 16M -vf setpts=PTS-STARTPTS -threads 4 -acodec aac -af asetpts=PTS-STARTPTS /data/reprostim/Videos/2024/10/2024.10.18-15.55.41.103--.mkv"}}, "totals": {"pmem": 0.7, "pcpu": 152.0, "rss": 242880512, "vsz": 791842816}, "averages": {"rss": 242880512.0, "vsz": 791842816.0, "pmem": 0.7, "pcpu": 152.0, "num_samples": 6}}
```

funny thing is that it is kinda a "recursive" one, since the file referenced inside is also a similar file (different key):

```
yoh@typhon:/data/repronim/reprostim-reproiner$ cat .git/annex/objects/vf/XG/MD5E-s881--eb285431d379280cb90b4878aed1506e.json/MD5E-s881--eb285431d379280cb90b4878aed1506e.json
/annex/objects/MD5E-s881--aba0346768ab94d0c47645bb6c366ad9.json
{"timestamp": "2024-10-18T16:32:46.436519-04:00", "num_samples": 6, "processes": {"270229": {"pcpu": 152.0, "pmem": 0.7, "rss": 242880512, "vsz": 791842816, "timestamp": "2024-10-18T16:32:46.436519-04:00", "etime": "37:05", "cmd": "SLsl ffmpeg -f alsa -ac 2 -thread_queue_size 4096 -i hw:2,1 -f v4l2 -input_format yuyv422 -framerate 60 -video_size 1920x1080 -thread_queue_size 4096 -i /dev/video2 -c:v libx264 -flush_packets 1 -preset ultrafast -crf 18 -tune zerolatency -b:v 8M -maxrate 8M -bufsize 16M -vf setpts=PTS-STARTPTS -threads 4 -acodec aac -af asetpts=PTS-STARTPTS /data/reprostim/Videos/2024/10/2024.10.18-15.55.41.103--.mkv"}}, "totals": {"pmem": 0.7, "pcpu": 152.0, "rss": 242880512, "vsz": 791842816}, "averages": {"rss": 242880512.0, "vsz": 791842816.0, "pmem": 0.7, "pcpu": 152.0, "num_samples": 6}}
```

and so on

```
yoh@typhon:/data/repronim/reprostim-reproiner$ cat .git/annex/objects/j9/vp/MD5E-s881--aba0346768ab94d0c47645bb6c366ad9.json/MD5E-s881--aba0346768ab94d0c47645bb6c366ad9.json
/annex/objects/MD5E-s883--27aa52025c5a0e211bc98a11a1230906.json
{"timestamp": "2024-10-18T16:31:46.269211-04:00", "num_samples": 6, "processes": {"270229": {"pcpu": 152.0, "pmem": 0.7, "rss": 242880512, "vsz": 791842816, "timestamp": "2024-10-18T16:31:46.269211-04:00", "etime": "36:05", "cmd": "SLsl ffmpeg -f alsa -ac 2 -thread_queue_size 4096 -i hw:2,1 -f v4l2 -input_format yuyv422 -framerate 60 -video_size 1920x1080 -thread_queue_size 4096 -i /dev/video2 -c:v libx264 -flush_packets 1 -preset ultrafast -crf 18 -tune zerolatency -b:v 8M -maxrate 8M -bufsize 16M -vf setpts=PTS-STARTPTS -threads 4 -acodec aac -af asetpts=PTS-STARTPTS /data/reprostim/Videos/2024/10/2024.10.18-15.55.41.103--.mkv"}}, "totals": {"pmem": 0.7, "pcpu": 152.0, "rss": 242880512, "vsz": 791842816}, "averages": {"rss": 242880512.0, "vsz": 791842816.0, "pmem": 0.7, "pcpu": 152.0, "num_samples": 6}}
```

and we have a good number of such files

```
yoh@typhon:/data/repronim/reprostim-reproiner$ find .git/annex -iname *json | grep -e "-s8...*json/" | nl | tail
    47	.git/annex/objects/PP/mj/MD5E-s805--9de925a22229846c1d2fc0e093a1ac52.json/MD5E-s805--9de925a22229846c1d2fc0e093a1ac52.json
    48	.git/annex/objects/x2/z6/MD5E-s805--75193845ecae9932c19e20a3f82da953.json/MD5E-s805--75193845ecae9932c19e20a3f82da953.json
    49	.git/annex/objects/7Z/WX/MD5E-s805--677e8e80b74fd4630eab59c1e282ef88.json/MD5E-s805--677e8e80b74fd4630eab59c1e282ef88.json
    50	.git/annex/objects/F3/Z8/MD5E-s805--6e7f52009527f012ecc8a7cb357ff81c.json/MD5E-s805--6e7f52009527f012ecc8a7cb357ff81c.json
    51	.git/annex/objects/zK/P1/MD5E-s802--7b4a1c02346b735e5b6a1013b10efdc7.json/MD5E-s802--7b4a1c02346b735e5b6a1013b10efdc7.json
    52	.git/annex/objects/6X/Pp/MD5E-s805--5b56dd90579069200ad0c9aa75497179.json/MD5E-s805--5b56dd90579069200ad0c9aa75497179.json
    53	.git/annex/objects/ZV/f1/MD5E-s805--ccb3cc51083ac83b432285fb49efb0d3.json/MD5E-s805--ccb3cc51083ac83b432285fb49efb0d3.json
    54	.git/annex/objects/vQ/6g/MD5E-s811--28942c25508ff0d5b458f429fcd9315b.json/MD5E-s811--28942c25508ff0d5b458f429fcd9315b.json
    55	.git/annex/objects/Q9/1M/MD5E-s804--b777a1cf08bb6b673c794f97889e7be4.json/MD5E-s804--b777a1cf08bb6b673c794f97889e7be4.json
    56	.git/annex/objects/fm/3w/MD5E-s881--dac95d73eaf2af163d98a7b914784807.json/MD5E-s881--dac95d73eaf2af163d98a7b914784807.json
```

I guess this happened before we switched to adding those files to git, not git-annex.... I see this potentially happening whenever we keep reopening/appending/closing a file instead of just keeping it open, writing, and then closing... yeap:

```shell
❯ git grep '"a"'
src/con_duct/__main__.py: with open(self.log_paths.usage, "a") as resource_statistics_log:
src/con_duct/__main__.py: with open(log_paths.info, "a") as system_logs:
```

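
To illustrate the failure mode described above, here is a minimal Python sketch. It is not duct's code. It assumes a POSIX filesystem and that some external tool replaces the file at the log path between two writes; the replacement is simulated with `os.replace`, and the file names, the `append_by_reopening` helper, and the `EXAMPLE-KEY` pointer text are all hypothetical. It only shows that reopening a path in `"a"` mode appends to whatever currently lives at that path.

```python
import os
import tempfile


def append_by_reopening(path, record):
    # Reopen the path for every record, as a repeated open(path, "a") would.
    with open(path, "a") as f:
        f.write(record + "\n")


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        usage = os.path.join(tmp, "usage.json")     # hypothetical log path
        pointer = os.path.join(tmp, "pointer.tmp")  # hypothetical stand-in

        append_by_reopening(usage, '{"sample": 1}')

        # Stand-in for an external tool swapping the file for a pointer file.
        with open(pointer, "w") as f:
            f.write("/annex/objects/EXAMPLE-KEY\n")
        os.replace(pointer, usage)

        # The next reopen lands on whatever now lives at the path.
        append_by_reopening(usage, '{"sample": 2}')

        with open(usage) as f:
            return f.read().splitlines()


if __name__ == "__main__":
    for line in demo():
        print(line)
```

In this simulation the file ends up as a pointer line followed by one appended record, the same shape as the `cat` output above. Whether git-annex swaps the file in exactly this way is an assumption here.
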
yarikoptic commented 1 month ago

oh, the situation is worse: it is actually leading to breeding those commits on reproiner as we speak. Will provide a PR shortly...

asmacdo commented 1 month ago

Summarizing the situation to ensure I understand correctly: the feature request here is that duct should be robust to a `datalad save` during a duct run that includes the duct output files.

yarikoptic commented 1 month ago

no. It should not do an open/close/open cycle on files. It should open once and close only upon completion. Adding a TODO in the initial PR -- came to realize that there is more than just the usage file
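
A minimal sketch of the "open once, close upon completion" pattern, under stated assumptions: `UsageLog` and `write_record` are hypothetical names for illustration, not duct's actual classes or API, and `"w"` mode is used on the assumption that one run owns the file from start to finish.

```python
import json
import os
import tempfile


class UsageLog:
    """Hypothetical writer: opens once, stays open, closes on exit."""

    def __init__(self, path):
        self.path = path
        self._file = None

    def __enter__(self):
        # "w" rather than "a": one run owns the file from start to finish.
        self._file = open(self.path, "w")
        return self

    def write_record(self, record):
        self._file.write(json.dumps(record) + "\n")
        # Flush so each record reaches the OS without closing the handle.
        self._file.flush()

    def __exit__(self, exc_type, exc, tb):
        self._file.close()
        self._file = None
        return False  # do not swallow exceptions from the monitored run


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "usage.json")
        with UsageLog(path) as log:
            for n in range(3):
                log.write_record({"num_samples": n})
        with open(path) as f:
            return [json.loads(line) for line in f]
```

Flushing after each record keeps the file readable by other processes while the run is in progress, without the repeated open/close cycle.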