Paper: Voice Computing with Python in Jupyter Notebooks

Directory	Preview	Checks	Updated (UTC)
papers/blaine_mooers	🔍 Inspect	✅ 28 checks passed (1 optional)	Jul 13, 2024, 12:28 PM

@.**

Thank you for the very thoughtful review! I will do my best to implement your suggestions.

On Tue, Jul 2, 2024 at 1:31 PM Bobby Jackson @.***> wrote:

@.**** commented on this pull request.

In papers/blaine_mooers/main.md https://github.com/scipy-conference/scipy_proceedings/pull/934#discussion_r1649137878 :

@@ -0,0 +1,443 @@ +--- +# Voice Computing with Python in Jupyter Notebooks +title: Voice Computing with Python in Jupyter Notebooks +abstract: |

Jupyter is a popular platform for writing literate programming documents that contain computer code and its output interleaved with prose that describes the code and the output. It is possible to use one's voice to interact with Jupyter notebooks. This capability opens up access to those with impaired use of their hands. Voice computing also increases the productivity of workers who are tired of typing, and increases the productivity of those workers who speak faster than they can type. Voice computing can be divided into three activities: speech-to-text, speech-to-command, and speech-to-code. Several automated speech recognition software packages operate on Jupyter notebooks and support these three activities. We will provide examples of these activities as they pertain to applications of Python in our research on the molecular structures of proteins and nucleic acids important in medicine. Several software tools at MooersLab on GitHub facilitate the use of voice computing software in Jupyter.

I would recommend rephrasing the seventh sentence. I did not see anything in here with regard to nucleic acids and proteins, so I would make this sentence more general, since you show an example of an equation being rendered.

In papers/blaine_mooers/main.md https://github.com/scipy-conference/scipy_proceedings/pull/934#discussion_r1649139999 :

+abstract: |

Jupyter is a popular platform for writing literate programming documents that contain computer code and its output interleaved with prose that describes the code and the output. It is possible to use one's voice to interact with Jupyter notebooks. This capability opens up access to those with impaired use of their hands. Voice computing also increases the productivity of workers who are tired of typing, and increases the productivity of those workers who speak faster than they can type. Voice computing can be divided into three activities: speech-to-text, speech-to-command, and speech-to-code. Several automated speech recognition software packages operate on Jupyter notebooks and support these three activities. We will provide examples of these activities as they pertain to applications of Python in our research on the molecular structures of proteins and nucleic acids important in medicine. Several software tools at MooersLab on GitHub facilitate the use of voice computing software in Jupyter. +---

+## Introduction

+Jupyter notebooks provide a highly interactive computing environment where users run Markdown and code cells to yield almost instant results. +This form of interactive computing provides the instant gratification of seeing the results of the cells' execution; this might be why Jupyter is so popular for data analysis @.***. +The most popular modality for interacting with the Jupyter notebooks is to use the keyboard and the computer mouse. +However, there are opportunities to generate prose and code using one's voice instead of one's hands. +While those who have lost use of their hands must rely solely on their voice, other users can enhance their prose generation with their voice to boost their productivity and give their hands a rest when fatigued from typing. This inclusive approach enhances the Jupyter notebook experience for all users, regardless of their physical abilities. +In other words, most users can use their voices to complement their keyboard use. +For example, dictating prose in Markdown cells is an obvious application of voice computing in Jupyter. +The ease of generating prose via speech can promote more complete descriptions of the computations executed in adjacent code cells.

How easy is the installation for someone who cannot use their hands to type? Have you experimented with using voice commands for the conda install?

In papers/blaine_mooers/main.md https://github.com/scipy-conference/scipy_proceedings/pull/934#discussion_r1649141855 :

+While those who have lost use of their hands must rely solely on their voice, other users can enhance their prose generation with their voice to boost their productivity and give their hands a rest when fatigued from typing. This inclusive approach enhances the Jupyter notebook experience for all users, regardless of their physical abilities. +In other words, most users can use their voices to complement their keyboard use. +For example, dictating prose in Markdown cells is an obvious application of voice computing in Jupyter. +The ease of generating prose via speech can promote more complete descriptions of the computations executed in adjacent code cells. + +Some speech-to-text software also supports mapping a word or phrase to a text replacement; there are many ways of exploiting text replacements in Markdown and code cells. +For Markdown cells, we have mapped the English contractions to their expansions, so whenever we say a contraction, the expansion automatically replaces the contraction. +This automation significantly reduces the need for manual editing, saving you valuable time and effort. By leveraging voice commands and text replacements, you can streamline your workflow and focus on the more critical aspects of your work. +Another class of text replacements is the expansion of acronyms into the phrase they represent. +The BibTeX cite keys for standard references can also be mapped to a command like cite key for scipy. +Equations typeset in LaTeX for rendering with MathJax can be mapped to commands like inline Pythagorean theorem and display electron density equation, depending on whether the equation is to be in-line in a sentence or centered in display-mode. +We also mapped voice commands to tables, templates, and software licenses. +For Jupyter code cells, we mapped voice commands to chunks of code of various sizes.
+In analogy to tab triggers with conventional tab-triggered snippets in advanced text editors, we call these voice commands that trigger a text replacement voice triggers. + +To facilitate voice commands in Jupyter notebook cells, we have developed sets of voice-triggered snippets for use in Markdown or code cells.

Suggest adding "Python" before "code".
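
As a rough illustration of the text-replacement idea in the hunk quoted above (a spoken word or phrase mapped to replacement text), here is a minimal Python sketch. It is not how Voice In Plus works internally; the table entries and the names `REPLACEMENTS` and `expand` are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical trigger-to-replacement table (illustrative entries only).
# Keys are lowercase because the matching below is case-insensitive.
REPLACEMENTS = {
    "don't": "do not",
    "it's": "it is",
    "can't": "cannot",
    "asr": "automated speech recognition",
}

# Try longer triggers first so a short trigger never shadows a longer one.
_PATTERN = re.compile(
    "|".join(
        r"\b" + re.escape(trigger) + r"\b"
        for trigger in sorted(REPLACEMENTS, key=len, reverse=True)
    ),
    re.IGNORECASE,
)


def expand(text: str) -> str:
    """Replace every known trigger in `text` with its mapped replacement."""
    return _PATTERN.sub(lambda m: REPLACEMENTS[m.group(0).lower()], text)


print(expand("We don't know why it's slow."))
```

A single compiled pattern keeps the lookup in one pass over the text. The sketch does not preserve the capitalization of the trigger as spoken; that would need extra handling.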

In papers/blaine_mooers/main.md https://github.com/scipy-conference/scipy_proceedings/pull/934#discussion_r1649142249 :

+The ease of generating prose via speech can promote more complete descriptions of the computations executed in adjacent code cells. + +Some speech-to-text software also supports mapping a word or phrase to a text replacement; there are many ways of exploiting text replacements in Markdown and code cells. +For Markdown cells, we have mapped the English contractions to their expansions, so whenever we say a contraction, the expansion automatically replaces the contraction. +This automation significantly reduces the need for manual editing, saving you valuable time and effort. By leveraging voice commands and text replacements, you can streamline your workflow and focus on the more critical aspects of your work. +Another class of text replacements is the expansion of acronyms into the phrase they represent. +The BibTeX cite keys for standard references can also be mapped to a command like cite key for scipy. +Equations typeset in LaTeX for rendering with MathJax can be mapped to commands like inline Pythagorean theorem and display electron density equation, depending on whether the equation is to be in-line in a sentence or centered in display-mode. +We also mapped voice commands to tables, templates, and software licenses. +For Jupyter code cells, we mapped voice commands to chunks of code of various sizes. +In analogy to tab triggers with conventional tab-triggered snippets in advanced text editors, we call these voice commands that trigger a text replacement voice triggers. + +To facilitate voice commands in Jupyter notebook cells, we have developed sets of voice-triggered snippets for use in Markdown or code cells. +We are building on our prior experience with tab-triggered code snippets in text editors @. and domain-specific code snippet libraries for Jupyter @.. +We have made libraries of these voice-triggered snippets for several of the popular modules of the scientific computing stack for Python.
+Although the Jupyter environment supports polyglot programming, we have restricted our focus to Python and Markdown.

Suggest removing and incorporating the previous suggestion to make the paper more concise.

In papers/blaine_mooers/main.md https://github.com/scipy-conference/scipy_proceedings/pull/934#discussion_r1649143836 :

+Some speech-to-text software also supports mapping a word or phrase to a text replacement; there are many ways of exploiting text replacements in Markdown and code cells. +For Markdown cells, we have mapped the English contractions to their expansions, so whenever we say a contraction, the expansion automatically replaces the contraction. +This automation significantly reduces the need for manual editing, saving you valuable time and effort. By leveraging voice commands and text replacements, you can streamline your workflow and focus on the more critical aspects of your work. +Another class of text replacements is the expansion of acronyms into the phrase they represent. +The BibTeX cite keys for standard references can also be mapped to a command like cite key for scipy. +Equations typeset in LaTeX for rendering with MathJax can be mapped to commands like inline Pythagorean theorem and display electron density equation, depending on whether the equation is to be in-line in a sentence or centered in display-mode. +We also mapped voice commands to tables, templates, and software licenses. +For Jupyter code cells, we mapped voice commands to chunks of code of various sizes. +In analogy to tab triggers with conventional tab-triggered snippets in advanced text editors, we call these voice commands that trigger a text replacement voice triggers. + +To facilitate voice commands in Jupyter notebook cells, we have developed sets of voice-triggered snippets for use in Markdown or code cells. +We are building on our prior experience with tab-triggered code snippets in text editors @. and domain-specific code snippet libraries for Jupyter @.. +We have made libraries of these voice-triggered snippets for several of the popular modules of the scientific computing stack for Python. +Although the Jupyter environment supports polyglot programming, we have restricted our focus to Python and Markdown.
+While some code snippets are one-liners, most code snippets span many lines and perform a complete task, such as generating a plot from a data file. +These libraries provide code that is known to work, unlike the situation with chatbots, which do not always return working code.

Is there any reference you can provide on the code functionality from chatbots compared to these libraries?

In papers/blaine_mooers/main.md https://github.com/scipy-conference/scipy_proceedings/pull/934#discussion_r1649162470 :

+The latter can store a record of one's performance on a quiz. + +:::{figure} ./images/runningQuiz.png +:label: fig:quiz +:width: 130% +An example of an interactive session with a quiz in a Jupyter notebook. The code for running the quiz was inserted into the code cell with the voice command run voice in quiz. The quiz covers a range of voice commands, including [specific voice commands covered in the quiz]. +::: + +To build long-term recall of the commands, one must take the quiz five or more times on alternate days, according to the principles of spaced repetition learning. +These principles were developed by the German psychologist Hermann Ebbinghaus in the last part of the 19th century. +They have been validated several times by other researchers. +Spaced repetition learning is one of the most firmly established results of research into human psychology. + +Most people need more discipline to carry out this kind of learning because they have to schedule the time to do the follow-up sessions. +Instead, most people will find it more convenient to take these quizzes several times in a half hour before they spend many hours utilizing the commands. +If that use occurs on subsequent days, then recall of the alphabet will be reinforced, and retaking the quiz may not be necessary.

What if the user does not have time to take these quizzes? Many programmers are very busy, and even an extra 20 to 30 minutes a day can be too much time for trying to improve the model on their own. If you do not retake the quiz, or stop retaking the quiz, what is the drop in accuracy of the voice-to-text translation?

In papers/blaine_mooers/main.md https://github.com/scipy-conference/scipy_proceedings/pull/934#discussion_r1649164160 :

+:label: fig:quiz +:width: 130% +An example of an interactive session with a quiz in a Jupyter notebook. The code for running the quiz was inserted into the code cell with the voice command run voice in quiz. The quiz covers a range of voice commands, including [specific voice commands covered in the quiz]. +::: + +To build long-term recall of the commands, one must take the quiz five or more times on alternate days, according to the principles of spaced repetition learning. +These principles were developed by the German psychologist Hermann Ebbinghaus in the last part of the 19th century. +They have been validated several times by other researchers. +Spaced repetition learning is one of the most firmly established results of research into human psychology. + +Most people need more discipline to carry out this kind of learning because they have to schedule the time to do the follow-up sessions. +Instead, most people will find it more convenient to take these quizzes several times in a half hour before they spend many hours utilizing the commands. +If that use occurs on subsequent days, then recall of the alphabet will be reinforced, and retaking the quiz may not be necessary. + + +### Voice In Plus

This section feels out of order. Is your library based on Voice In Plus? I think it is when I read it, but I think that needs to be further clarified. In addition, I would introduce Voice In Plus before introducing your voice library that it uses.

In papers/blaine_mooers/main.md https://github.com/scipy-conference/scipy_proceedings/pull/934#discussion_r1649165413 :

+The first setting to be set is the language that will be used during dictation. +There is support for several foreign languages and different dialects of English. +The user can also configure a keyboard shortcut that can be utilized to turn the plugin on and off. + +Voice In is offered as a freemium. +The user has to pay for an annual subscription to be able to add custom text replacements. +This full-featured version of the plugin is called Voice-In Plus (VIP). +We will focus on VIP. + +On activation of the VIP version of the plugin, the settings GUI page for custom commands is displayed for the user to use to enter commands either one by one through a GUI or by adding multiple voice commands through the text area that is opened after clicking on the bulk add button {ref}fig:newSentence. +The first option involves placing the voice trigger in one text area and the text replacement in the second text area. +The voice trigger does not need a comma after it, and the text replacement can span multiple lines without adding any markup, except that internal double quotes must be replaced with single quotes. +Any capitalization in the voice trigger will be ignored and written in lowercase. +The second option involves pasting in one or more lines of pairs of voice triggers and text replacements separated by commas, as in a CSV file. +In this option, text replacements that span more than one line must be enclosed with double quotes. +The internal double quotes must be replaced with single quotes; otherwise, the text replacement will be truncated at the position of the first internal double quote.

I would suggest shortening this section and adding the caveats and associated subscription costs to a separate section in the code's documentation that can be referred to from your paper.
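
The bulk-add rules described in the hunk quoted above (comma-separated trigger and replacement, multi-line replacements wrapped in double quotes, internal double quotes swapped for single quotes, triggers treated as lowercase) can be sketched in Python as follows. This is only an illustration of those stated rules: the helper name `format_bulk_entry` is hypothetical, and how a comma inside a single-line replacement should be handled is not covered by the quoted text, so the sketch does not address it.

```python
def format_bulk_entry(trigger: str, replacement: str) -> str:
    """Format one voice-trigger/replacement pair as a bulk-add line.

    Hypothetical helper that follows only the rules stated in the
    quoted paper text; it is not part of any Voice In Plus API.
    """
    # Capitalization in the trigger is ignored, so store it in lowercase.
    trigger = trigger.strip().lower()

    # Internal double quotes would truncate the replacement, so swap
    # them for single quotes.
    body = replacement.replace('"', "'")

    # Replacements that span more than one line must be enclosed in
    # double quotes.
    if "\n" in body:
        body = f'"{body}"'

    return f"{trigger},{body}"


# Single-line replacement: a LaTeX equation for a Markdown cell.
print(format_bulk_entry("Inline Pythagorean Theorem", "$a^2 + b^2 = c^2$"))

# Multi-line replacement: a short Python snippet for a code cell.
print(format_bulk_entry("import numpy", 'import numpy as np\nprint("ok")'))
```

Generating the lines from a script rather than by hand makes it easier to apply the quoting rules consistently before pasting into the bulk-add text area.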

In papers/blaine_mooers/main.md https://github.com/scipy-conference/scipy_proceedings/pull/934#discussion_r1649166779 :

+### Independence from breaking changes in Jupyter + +The Jupyter project lacks built-in support for code snippet libraries. +Due to the inherent limitations of the Jupyter project, the development of third-party extensions has become a necessity to support code snippets. +Unfortunately, changes in the core of Jupyter often break these extensions. +Users have to create Python environments for older versions of Jupyter to work with the snippets extension while missing out on the new features of Jupyter. +An obvious solution to this problem would be for the Jupyter developers to incorporate one of the snippet extensions into the base distribution of Jupyter to ensure that at least one form of support for snippets is always available. +Using voice-triggered snippets external to Jupyter sidesteps the difficulty with breaking changes to Jupyter. + +### Filling gap in tab-triggered snippets with voice-triggered snippets + +Voice-triggered snippets, a promising innovation, offer a potential solution to the absence of extensions for Jupyter that support tab-triggered snippets. +Tab-triggered code snippets are standard in most text editors, whereas voice-triggered snippets have yet to become widespread in standard text editors. +One advantage of Jupyter Notebooks is that they run in the browser, where several automated speech recognition software packages operate (e.g., Voice-In Plus, Serenade, and Talon Voice). +We developed our libraries for Voice In Plus software because of its gentle learning curve and straightforward customization. +We did this to meet the needs of the broadest population of users.

This section should be where you justify your use of Voice In Plus.

In papers/blaine_mooers/main.md https://github.com/scipy-conference/scipy_proceedings/pull/934#discussion_r1649168122 :

+One advantage of Jupyter Notebooks is that they run in the browser, where several automated speech recognition software packages operate (e.g., Voice-In Plus, Serenade, and Talon Voice). +We developed our libraries for Voice In Plus software because of its gentle learning curve and straightforward customization. +We did this to meet the needs of the broadest population of users. + +### The role of AI-assisted voice computing + +The dream of AI-assisted voice computing is to have one's intentions rather than one's words inserted into the document one is developing. +Our exposure to what is available through ChatGPT left us with an unfavorable impression due to the high error rate. +GitHub's Copilot can also be used in LaTeX to autocomplete sentences. +Here again, many of the suggested completions need to be more accurate and require editing. +These autocompleted sentences slow down the user by getting in the way and leaving no net gain in productivity. + +In addition, AI assistance in scientific writing has to be disclosed upon manuscript submission. +Some publishers will not accept articles written with the help of AI-writing assistants. +This could limit the options available for manuscript submission if one uses such an assistant and has the manuscripts rejected by a publisher that does not accept such assistants. +

Do you have a quantitative assessment of ChatGPT's error rate, or a reference to one? I would also move this section into the introduction, where it can provide the motivation for your Voice In Plus library.

In papers/blaine_mooers/main.md https://github.com/scipy-conference/scipy_proceedings/pull/934#discussion_r1649169204 :

+In addition, AI assistance in scientific writing has to be disclosed upon manuscript submission. +Some publishers will not accept articles written with the help of AI-writing assistants. +This could limit the options available for manuscript submission if one uses such an assistant and has the manuscripts rejected by a publisher that does not accept such assistants. + +### ASR extensions for Jupyter Lab + +We found three extensions developed for Jupyter Lab that enable speech recognition in Jupyter notebooks. +The first, jupyterlab-voice-control, supports custom commands and relies on the browser's language model. +This extension is experimental and not maintained; it does not work with Jupyter 4.2. +The second extension, jupyter-voice-comments, relies on the DaVinci large language model to make comments in Markdown cells and request code fragments. +This program requires clicking on a microphone icon repeatedly, which makes the user vulnerable to repetitive stress injuries. +The third extension is jupyter-voicepilot. +Although the extension's name suggests it uses GitHub's Copilot, it uses whisper-1 and ChatGPT3. +This extension requires an API key for ChatGPT3. +The robustness of our approach is that the Voice-In Plus software will always operate within Jupyter Lab when Jupyter is run on a web server. +

I would make this into a table in your introduction to motivate why your library is used.

In papers/blaine_mooers/main.md https://github.com/scipy-conference/scipy_proceedings/pull/934#discussion_r1649170704 :

+Newer language models can often accurately transcribe your words using the internal microphone from your laptop or desktop computer. +Contrary to the prevailing advice in some quarters, a high-quality external microphone may not be required. +The microphone in our 2018 MacBook Pro works well with Voice In Plus. + +Fourth, one can inadvertently change the case of words while dictating in Voice In Plus. +To switch back to the default case, one needs to navigate to the options page and select the text transform button to open a GUI that lets you set the case globally. +This event occurs about once every 100 hours of dictation. + +Fifth, a related problem is the inadvertent activation of other voice computing software on one's computer. +For example, once in about 100 hours of dictation, one will say a phrase that resembles Hey, Siri. +Siri will then respond. +One solution is to inactivate Siri so that it cannot respond to one's speech. + +These caveats are minor annoyances. +We think that the productivity gains outweigh the disruptions caused by these annoyances. +

I don't think this section is needed. It can be moved to a caveats or "common issues" section in the documentation, with the link provided in the paper.

On papers/blaine_mooers/main.md https://github.com/scipy-conference/scipy_proceedings/pull/934#discussion_r1649173092 :

I found the concept of the authors' library to be very compelling. I've had an intern who could not use his hands, so having something like this would have been very helpful. However, when I read the paper, I found the organization to be disjointed, with various parts of the paper in the wrong section. I've provided inline suggestions for rearranging the paper.

In addition, for me to accept the publication, I think the authors need to either cite or perform a quantitative analysis of the code failure rate of ChatGPT compared to Voice In Plus, to show that this is an improvement over current tools for writing Python code cells and Markdown cells.


-- Best regards,

Blaine

Blaine Mooers, Ph.D., Associate Professor, Department of Biochemistry and Molecular Biology, College of Medicine, University of Oklahoma Health Sciences

