Open p-i- opened 1 month ago
If you could just hook the `did_complete` of the Dictation tool and use AI to post-process and re-render the affected text, maybe this would do the job. If that's possible...
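A minimal sketch of what that post-processing step might look like, assuming some completion hook hands over the inserted text and the utterance. Both function names are hypothetical and nothing here is a VS Code or macOS API. It uses plain string heuristics rather than AI, and only covers the duplicate-injection and leading-capitalization cases:

```typescript
// Hypothetical helpers for illustration only; not part of VS Code or macOS.

/** Collapse an utterance that was injected twice in a row, e.g. "Test 123 Test 123". */
function collapseDuplicate(inserted: string, utterance: string): string {
  const u = utterance.trim();
  if (u.length === 0) {
    return inserted;
  }
  // Escape regex metacharacters so the utterance is matched literally.
  const escaped = u.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // Two copies separated only by optional whitespace become one copy.
  const doubled = new RegExp(escaped + "\\s*" + escaped, "g");
  return inserted.replace(doubled, () => u);
}

/**
 * Pick the case of the first letter from the text before the cursor.
 * Naive assumption: mid-sentence text should start lowercase, which is
 * wrong for "I" and proper nouns.
 */
function fixLeadingCase(before: string, utterance: string): string {
  if (utterance.length === 0) {
    return utterance;
  }
  const context = before.trimEnd();
  // Start of document, or directly after sentence-ending punctuation.
  const startsSentence = context.length === 0 || /[.!?]$/.test(context);
  const first = utterance.charAt(0);
  const rest = utterance.slice(1);
  return (startsSentence ? first.toUpperCase() : first.toLowerCase()) + rest;
}
```

For example, `collapseDuplicate("Test 123 Test 123", "Test 123")` yields `"Test 123"`, and `fixLeadingCase("the quick ", "Brown fox")` yields `"brown fox"`.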
Here's an example of the duplicate-text bug.
I'm speaking "test 123", optionally followed by "full stop" or "new paragraph", and then hitting BACKSPACE, ENTER, LEFT-ARROW, 'a', or pretty much anything, it seems.
It seems that if I don't allow enough silence for it to 'settle down' after I've said 'full stop', the utterance text gets double-injected into the window.
In TextEdit I can't replicate this particular failure, but it isn't 100% right there either: it inserts unwanted newline characters.
https://github.com/microsoft/vscode/assets/693495/4b517d5c-6938-4ae3-8830-4c1e4b64a1ca
Here's a demo of the wordwrap + superposition issue:
https://github.com/microsoft/vscode/assets/693495/cb098b5f-e6d6-492e-af7c-7917ce421d30
Here's an example of the capitalization-of-start-of-new-phrase problem:
There are other situations where I get a capitalization failure, e.g. placing the cursor inside a sentence and speaking.
https://github.com/microsoft/vscode/assets/693495/9fafdb22-f227-4c10-baac-dd838215d283
This one is probably a really tricky fix, as the macOS dictation assistant is clearly scraping the text of the active window and operating on that.
I think a native VS Code speech tool would be a much-appreciated feature!
Here's a nice repeatable minimal testcase for duplication.
All I do here is double-tap Fn to invoke the macOS speech-to-text assistant and speak "Test 123", followed by a couple of seconds of silence, followed by "New paragraph".
And then I just wait.
Firstly it DOESN'T create a new paragraph, just a couple of spaces. Secondly, once it times out it dumps a duplicate of the utterance.
https://github.com/microsoft/vscode/assets/693495/6b58590b-aa13-47f7-b766-4f9547a9c6f2
Type: Bug
Just try using the built-in macOS Dictation tool in VS Code.
(This tool can be activated under System Settings -> Keyboard -> Dictation.)
Many problems:
I think that the fundamental problem here is with this macOS tool. I think its design is overly complex and intricate, and it often falls over.
Given that most VS Code users spend most of their day entering text into VS Code, it would be really nice to have a solution that takes care of speech-to-text. Maybe a fix to interop with this Dictation tool, maybe an extension, maybe core VS Code functionality.
I'm not bothered about speech-to-code; I'm quite happy to type my code. But if I am editing text files (.txt, .md, .nt, etc.) or modifying text content within the code (e.g. AI prompts, docstrings, strings, comments, etc.), I would like something simple and reliable.
VS Code version: Code 1.89.1 (dc96b837cf6bb4af9cd736aa3af08cf8279f7685, 2024-05-07T05:14:32.757Z)
OS version: Darwin arm64 23.4.0
Modes:
System Info
|Item|Value|
|---|---|
|CPUs|Apple M2 (8 x 24)|
|GPU Status|2d_canvas: enabled<br>canvas_oop_rasterization: enabled_on<br>direct_rendering_display_compositor: disabled_off_ok<br>gpu_compositing: enabled<br>multiple_raster_threads: enabled_on<br>opengl: enabled_on<br>rasterization: enabled<br>raw_draw: disabled_off_ok<br>skia_graphite: disabled_off<br>video_decode: enabled<br>video_encode: enabled<br>webgl: enabled<br>webgl2: enabled<br>webgpu: enabled|
|Load (avg)|2, 2, 2|
|Memory (System)|24.00GB (2.49GB free)|
|Process Argv|--crash-reporter-id f10d97cd-2115-4dba-a34a-07be9312995a|
|Screen Reader|no|
|VM|0%|
Extensions (21)
Extension|Author (truncated)|Version
---|---|---
dvt-remote-ssh|ami|1.0.0
nestedtext|bma|2.0.0
githistory|don|0.6.20
copilot|Git|1.194.886
copilot-chat|Git|0.15.2024043005
vsc-python-indent|Kev|1.18.0
rainbow-csv|mec|3.11.0
vscode-docker|ms-|1.29.1
debugpy|ms-|2024.6.0
python|ms-|2024.6.0
vscode-pylance|ms-|2024.5.1
jupyter|ms-|2024.4.0
jupyter-keymap|ms-|1.1.2
jupyter-renderers|ms-|1.0.17
vscode-jupyter-cell-tags|ms-|0.1.9
vscode-jupyter-slideshow|ms-|0.1.6
remote-containers|ms-|0.362.0
remote-ssh|ms-|0.110.1
remote-ssh-edit|ms-|0.86.0
remote-explorer|ms-|0.4.3
vscode-speech|ms-|0.8.0

(1 theme extensions excluded)

A/B Experiments
```
vsliv368cf:30146710
vspor879:30202332
vspor708:30202333
vspor363:30204092
tftest:31042121
vstes627:30244334
vscorecescf:30445987
vscod805cf:30301675
binariesv615:30325510
vsaa593cf:30376535
py29gd2263:31024239
vscaac:30438847
c4g48928:30535728
azure-dev_surveyone:30548225
2i9eh265:30646982
962ge761:30959799
pythongtdpath:30769146
welcomedialog:30910333
pythonidxpt:30866567
pythonnoceb:30805159
asynctok:30898717
pythontestfixt:30902429
pythonregdiag2:30936856
pythonmypyd1:30879173
pythoncet0:30885854
2e7ec940:31000449
pythontbext0:30879054
accentitlementst:30995554
dsvsc016:30899300
dsvsc017:30899301
dsvsc018:30899302
cppperfnew:31000557
dsvsc020:30976470
pythonait:31006305
chatpanelt:31048053
dsvsc021:30996838
jg8ic977:31013176
pythoncenvptcf:31049071
a69g1124:31046351
pythonprc:31047982
dwnewjupytercf:31046870
26j00206:31048877
```