Hello @giuliataurino and @jeffblackadar,
It is great to see this Issue opened, and the process of review beginning!
A reminder that we host the lesson Markdown file and any related assets (images or data) here on our ph-submissions repo.
I've uploaded the Markdown file to /en/drafts/originals/transcribing-handwritten-text-with-python-and-azure.md, where we encourage you to make direct edits going forwards (no need to use the PR system).
I've uploaded the lesson's images to /images/transcribing-handwritten-text-with-python-and-azure.
@jeffblackadar, you will note that I have removed two images which were in the /images folder on your repo but are not referenced in the lesson Markdown file.
You'll also note that I've adjusted the syntax to display images. We use liquid and we require this format (example):
{% include figure.html filename="file-name.png" alt="Visual description of figure image" caption="Caption text to display" %}
I've filled in the minimum needed, and we can return to add alt-text during the review/revision process.
Jeff, I'd like to ask if you could take another look at the footnotes. I notice that [^3]
appears twice, and overall the placement seems a little odd. We can work together to confirm these details if you have any questions.
@giuliataurino, you will notice that I've added in our YAML header, which is required to generate an online preview of the lesson. It is now ready for you to read: http://programminghistorian.github.io/ph-submissions/en/drafts/originals/transcribing-handwritten-text-with-python-and-azure. (I've updated the link you posted in your initial comment).
Thanks for this Anisa,
Footnotes: I added a footnote 4 (Ibid). I had to use a pull request, because GitHub says my id (jeffblackadar) does not have access to update the project. Jeff
Apologies, @jeffblackadar. I've just sent you an invitation to be a collaborator on ph-submissions. Please let me know if you've received it. It will enable you to make direct changes to your lesson.
In the meantime, I've merged in your changes.
Thanks Anisa, I have the invitation. Jeff
Thank you Anisa for helping with the process.
And @jeffblackadar, thank you for the submission, I'll review it and get back to you in the following weeks.
Giulia
Hi @jeffblackadar,
Thank you for this tutorial, I found it very clear and useful. The lesson is overall straightforward and accessible to an entry-level audience, as it provides a detailed description of the tools needed, technical dependencies, and commentary on the code. The walk-through for setting up Azure and transcribing handwriting from images was helpful for making the workflow smooth and easy. The code worked without errors on my end. There is no major revision I envisage, but I do have a few minor suggestions. You can read more detailed feedback and comments below.
General suggestions:
Introduction
[x] Section 1: The tutorial includes references and links to previously published lessons on PH English that help the reader deepen their knowledge of the subject and acquire contextual skills. It might be useful to mention other OCR tutorials used for similar tasks but on typewritten documents, as found here: https://programminghistorian.org/en/lessons/working-with-batches-of-pdf-files; https://programminghistorian.org/en/lessons/cleaning-ocrd-text-with-regular-expressions; https://programminghistorian.org/en/lessons/generating-an-ordered-data-set-from-an-OCR-text-file.
[ ] Section 5: While the objectives and possible challenges of existing OCR models for handwriting transcription are well presented in the introduction, I suggest giving a more explicit example of the cases in which the model is likely not going to perform well (e.g. under-documented languages; certain file formats; poor resolution or unclear handwriting).
[ ] Section 6: You might want to expand on the choice of focusing on Azure, as opposed to other software, for this lesson - maybe in relation to existing case studies or for usability reasons, or else in connection with technical accessibility (e.g. was it already used in other digital humanities projects? Is the software well documented? Does it need some amount of coding, hence the tutorial?). Additionally, you could add a brief section describing the goals of this tutorial and its usability for further applications.
Prerequisites
Procedure
Summary
Specific comments: (@anisa-hawes, feel free to step in if I’m missing some aspects regarding the editorial guidelines)
Thank you again for your time and work.
Thank you, @giuliataurino.
I've corrected the link in the introductory paragraph – there was a rogue bracket causing a problem!
I've also made some interventions where I noticed that numbering sequences are interrupted by a figure or a code block and then re-start from 1. This problem is fixed in Markdown by adding a backslash after the number, like this: 1\. 2\.
But I'm not 100% sure I've caught all of them... @jeffblackadar, I wonder if you could double check this?
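To make the fix concrete, here is an illustrative raw-Markdown snippet (not taken from the lesson itself):

```markdown
1\. First step.

{% include figure.html filename="file-name.png" alt="Visual description" caption="Caption text" %}

2\. Second step (the backslash stops Markdown from restarting the numbering at 1 after the figure).
```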
I'd like to recommend reducing/removing the numbered sequences within the sections. Within such a short lesson, I think the sub-section numbering could become more confusing than clarifying. For example, in Section 4. Install Azure Computer Vision on your machine, there are two numbered sub-steps. I think these could simply be sentences:
Create a new cell in your notebook [...]
Create another new cell [...]
Let me know what you both think.
Hi Giulia (@giuliataurino) and Anisa, thank you very much for this feedback. Some responses below:
Section 1: The tutorial includes references and links to previously published lessons on PH English that help the reader deepen the subject and acquire contextual skills. It might be useful to mention other OCR tutorials used for similar tasks but on typewritten documents, as found here: https://programminghistorian.org/en/lessons/working-with-batches-of-pdf-files; https://programminghistorian.org/en/lessons/cleaning-ocrd-text-with-regular-expressions; https://programminghistorian.org/en/lessons/generating-an-ordered-data-set-from-an-OCR-text-file.
Jeff: These are added
Section 5: While the objectives and possible challenges of existing OCR models for handwriting transcription are well presented in the introduction, I suggest to give a more explicit example of the cases in which the model is likely not going to perform well (e.g. under-documented languages; certain file formats; poor resolution or unclear handwriting).
Section 6: You might want to expand on the choice of focusing on Azure, as opposed to other softwares, for this lesson - maybe in relation to existing case studies or for usability reasons, or else in connection with technical accessibility (e.g. was it already used in other digital humanities projects? is the software well documented? does it need some amount of coding, hence the tutorial?). Additionally, you can add a brief section with a description of the goals of this tutorial and its usability for further applications.
Jeff: I've made alterations. As background, I have only seen one other (non-academic) comparison of the services. I did a comparison of my own (https://jeffblackadar.ca/uncategorized/handwriting-transcription-of-a-fieldbook-with-microsofts-azure-cognitive-services-and-amazons-aws-textract/), but I have not collected statistics on accuracy. Microsoft looks to be the leader, but I don't have strong evidence to say that. One advantage of Microsoft is the free tier of 5,000 images per month; Amazon's and Google's free tiers are much less generous.
You will want to make sure that the prerequisite skills are correctly listed. In this respect, what are the platforms supported (e.g. Windows, SaaS/Web)?
Jeff: any platform is supported, as long as Python is available and there is an internet connection.
For example, in Section 4. Install Azure Computer Vision on your machine
Jeff: I changed these to sentences
Can the model be re-trained on a different set of data?
Jeff: While this is a closed box and can't be trained, I'll add a note about that.
If the reader does have knowledge of Python, can this code be adapted to perform more advanced tasks (e.g. loop through multiple files)?
Jeff: It can, I'll expand on this.
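To illustrate what such an expansion might look like, here is a hedged sketch of looping the lesson's single-image step over multiple files. `transcribe_url` is a placeholder for the lesson's Azure call (which needs credentials), so that only the looping logic is shown; nothing here is taken from the lesson's actual code.

```python
# Hypothetical sketch: adapting single-image transcription to a batch of
# images. transcribe_url stands in for the real Azure Read API call.

def transcribe_url(image_url):
    """Stand-in for the Azure Read API call; returns dummy transcribed lines."""
    return ["(transcribed text for %s)" % image_url]

def transcribe_batch(image_urls):
    """Transcribe each image URL in turn, collecting results per image."""
    results = {}
    for url in image_urls:
        try:
            results[url] = transcribe_url(url)
        except Exception as err:  # e.g. a network error or unsupported file
            results[url] = ["ERROR: %s" % err]
    return results

# Hypothetical image URLs; replace with the addresses of your own images.
urls = [
    "https://example.org/fieldbook-page-001.jpg",
    "https://example.org/fieldbook-page-002.jpg",
]
for url, lines in transcribe_batch(urls).items():
    print(url, "->", lines[0])
```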
Which part of the code - if any - is likely to return an error?
Jeff: I would say the secret key is most likely to cause a problem. There is a note in section 3b.
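Since the secret key is the most likely failure point, a small defensive sketch may help. Reading the key at runtime keeps it out of the saved notebook text, and stripping stray whitespace or quotation marks avoids a common cause of rejected requests. The `AZURE_CV_KEY` environment variable and the function name are assumptions for illustration, not part of the lesson.

```python
# Hedged sketch of the key-entry step noted in section 3b. getpass hides the
# key as it is typed; the environment variable is an optional fallback.

import os
from getpass import getpass

def read_key(prompt="Enter your Computer Vision KEY 1: "):
    key = os.environ.get("AZURE_CV_KEY") or getpass(prompt)
    # Remove whitespace and accidental surrounding quotes before use.
    return key.strip().strip('"').strip("'")
```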
This problem is fixed in Markdown by adding a backslash after the number, like this: 1\. 2\. but I'm not 100% sure I've caught all of them... @jeffblackadar I wonder if you could double check this?
Jeff: I checked and added some newlines to a couple places like section 3b
Much appreciated, Best wishes for the New Year! Jeff
Happy New Year, @jeffblackadar.
Thank you for your responses and updates. I'm still not 100% sure about the numbering. I am going to have another look, but my initial instinct is that the sub-sections numbered 1-6 within Images to Transcribe may be better as simple sub-titles. Sub-sub-section 3.B. Create a notebook looks a little odd in the table of contents.
I'm tagging @giuliataurino here + above to ensure a notification.
Many thanks, Anisa. Is there anything you need me to do? Thanks, Jeff
Dear @jeffblackadar,
Thank you for your message. There's nothing further we need from you at this stage. @giuliataurino is coordinating the peer-review process and will be in touch to let you know who will be contributing.
Many thanks, Jeff
Dear @jeffblackadar,
Apologies for the delay. I'm glad to announce that @mdermentzi will be reviewing your submission! She will be posting her review as a comment to this GitHub issue in the following month or so.
Thank you for your patience as we move forward towards publication.
Best,
Giulia
Wonderful, Thank you both! Jeff
Hi @jeffblackadar,
Thank you for this very useful tutorial. You’ve made it accessible enough so that someone without prior experience with Python or handwritten text recognition can follow along and start transcribing handwritten documents with minimal effort. My view is that it will be valuable to historians as well as archivists who need to perform such work for research or cataloging purposes. Once it’s published, I will definitely start using it and recommend it to the historians and archivists with whom I collaborate.
I didn’t notice any serious issues with this tutorial. Most of my suggestions seek to simplify the structure or anticipate questions that beginners might have while following the steps. Mind you, I’m reviewing this lesson having just a week ago delivered a how-to workshop to historians with varying tech skills using Google Colab; seeing what they struggled with, my comments are aimed at ensuring that even complete beginners will find this tutorial as easy as possible. I didn’t focus on copy editing or compliance with the author guidelines, leaving these for the editors to check.
Overall, my main suggestion would be to make it clear early on that you recommend users follow this tutorial using Google Colab and prioritise this type of users throughout your instructions. This will make the prerequisites section more straightforward and the tutorial easier to follow. If you choose to do this, adding more screenshots of the Colab environment is also important.
Here’s my detailed feedback:
At the beginning of the tutorial, the reader might benefit from a short and concise Learning Objectives or learning goals section similar to other Programming Historian tutorials.
Par 1, final sentence: introduce the OCR abbreviation in this paragraph, and use OCR from here onward.
Paragraphs 2, 3 & 4: It is quite possible that I might have missed something but, having tried to find out what model is powering the Azure computer vision service showcased in this tutorial, my understanding is that Microsoft does not clarify what model architectures (or what datasets as the tutorial rightly points out) they have used to train the models powering their APIs. For this reason, a focus on CNNs might be a bit redundant; it is perhaps giving the idea that this is what is powering the Azure service, which may or may not be true. For example (and this is not my area of expertise–I only did a quick search for this so I could be wrong), in recent years, transformers have also been used for OCR. My suggestion, therefore, would be to remove direct mentions of CNNs, as they might additionally be alienating to beginners.
If, however, the purpose of referring to the CNNs is to provide more context about the progress of this field, my suggestion would be to add a disclaimer clarifying that we don’t know how the Azure service, which is showcased later in the tutorial, works. Additionally, if this part is kept, I would suggest expanding on it a bit more, citing even key papers that led to relevant breakthroughs. From my experience working with historians in Europe, many of them have tried such tools before (granted, without having trained custom models) and are sceptical about their success, so recent advances in AI might encourage them to give handwriting recognition another go.
More detailed suggestions per paragraph:
Par 2: I’d recommend starting this paragraph with “Digitally transcribing [...]”. It’d be helpful to remove the parentheses and better integrate the PH references in one or more sentences starting with something like “Previous Programming Historian tutorials that have demonstrated typed text recognition include: ”. Consider adding one more reference to the latest PH OCR lesson that uses the Google Vision API (https://programminghistorian.org/en/lessons/ocr-with-google-vision-and-tesseract) either here or in another paragraph. You could then continue the paragraph with the first sentence, slightly changed, such as: “Recent advances in artificial intelligence offer the ability for historians to automatically transcribe handwritten documents [...]”. In the bit where it says “within limits of types of letters used, language and legibility”, the expression “types of letters” might read better and be more inclusive if changed to “writing systems”. Final sentence: remove the mention of CNNs, and add another disclaimer here to make it clear that this is only true for certain writing systems and languages, so that readers won’t get disappointed if they get bad results when trying this with images of texts written in lower-resource languages and writing systems.
I would cut paragraph 3 and keep paragraph 4 but remove the CNN bit towards the end. To make up for removing these parts, you could add another sentence somewhere explaining that these models are only as good as the data on which they were trained and advising historians to keep in mind that their results will reflect their training data, with all the biases stemming from how and by whom the training dataset was put together.
Par 4: “as long as these documents are recognizable to the service” – expand on what recognizable means in this context; for example, recognizable in terms of the writing system used, language, file type, etc. Final two sentences: fix “is not” to “are not”. It might be best to cut from “I assume [...] property”. Final sentence: it’d be interesting to know what this assumption is based on. Is it based on personal experience using some of these services, or on how similar models whose details we do know work?
Par 6: In this paragraph, I would strongly recommend adding what languages are currently supported.
Par 7: It would be interesting to read what scripts and languages you’ve tried it with.
Prerequisites section: First requirement: I’d suggest changing to something like “Knowledge of Python is not required since all of the code is provided in the tutorial. That said, basic Python knowledge would be useful for users who wish to understand the code or to tweak it for their purposes.”
Second requirement: I’d suggest changing to “Google Colab, a web-based virtual Python programming platform, was used to write this lesson. If you choose to use Google Colab to program Python (recommended), a Google account is required. If you choose to run the code in this tutorial locally on your own machine, Python and pip need to be installed.”
Also, it would be good to check if there is a specific version of Python required (I think it's 3+) and, if so, add this to the text. Perhaps add another footnote here to point to the python-sdk quickstart guide found later within the text.
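If a version requirement is added, a one-line check could accompany it. Python 3 is required for the Azure SDK; the exact minimum below is an assumption, so verify it against the python-sdk quickstart guide.

```python
# Quick check of the local Python version against an assumed minimum.

import sys

def python_ok(minimum=(3, 7)):  # assumed minimum version; verify in SDK docs
    return sys.version_info >= minimum

print("Python version OK:", python_ok())
```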
Fourth requirement: Change to “Credit card or debit card” so that those with no access to credit are not discouraged.
Consider whether you want to address users who are already familiar with Google Colab or not. If familiarity with Google Colab is not listed in this section, there could be more screenshots and explanations about how to create new cells and run cells in Google Colab after paragraph 39. My recommendation would be to simply add more screenshots and instructions, because this tutorial could be easily followed by beginners as long as they’re not getting confused by simple steps that might, instead, come intuitively to more experienced users.
Procedure section: The Procedure section (par 8), followed by the separate Images to transcribe section (par 9), makes the structure of this tutorial slightly confusing. I would suggest either flipping the two sections, or making steps 5 and 6 of the Procedure section subsections of a parent section called “Transcribe handwriting”, which would start with the “Images to transcribe” subsection.
So perhaps a better structure would be:
Contents
(Learning Objectives)
Introduction
Prerequisites
Procedure
Register for a Microsoft account.
Create a “Computer Vision” Resource in Azure to perform transcription.
Store a secret Key and Endpoint to access Computer Vision from your machine.
Install Azure Computer Vision on your machine.
Transcribe handwriting
Summary
Bibliography
Footnotes
Par 9: In the Images to transcribe section, I would start by saying that “Microsoft’s Azure Cognitive Services require that images used [...]”.
Par 10-16: Create a “Computer Vision” Resource in Azure to perform transcription. When following this process, I didn’t get a “start with an Azure free trial” message. Instead, I got a “Checking on your subscription” message and then Azure asked me to upgrade my account. Apparently, I was not eligible for an Azure free account, and so I had to sign up for Azure with the pay-as-you-go pricing. This didn’t imply that I actually had to pay anything, but it felt unclear and intimidating. Therefore, it might be useful to update the text of the tutorial so as to include this as a potential scenario for those who don’t see the Azure free trial prompt and clarify that they won’t actually get charged, because there are free quotas available in the pay-as-you-go subscription (unless they have already spent them).
Par 20: Instead of Azure subscription 1, there was a second option “Free trial”, which is the one that I selected. I can see that there have been many months since this tutorial was first submitted, so it might be worth going over the process once again to check if these instructions are up to date. The rest of the instructions including Par 22 were correct. (Pricing tier to Free F0 etc)
Par 28-29: Here, beginners would benefit from more information on what an endpoint and keys are. Consider adding a footnote to offer some context.
Par 30: This paragraph might be confusing to users who’ve been following the tutorial using Colab and have not created any folders. Also, perhaps it’d be more straightforward to start the paragraph with the sentence that is currently last in this paragraph and make the distinction of what users need to do depending on whether they’re using Github or not. In any case, make it clear that these keys are not meant to be shared with anyone under any circumstances. Also, consider integrating this paragraph later in the text into par 34, where users are asked to copy KEY 1.
Par 36: Make the Colab link clickable. The paragraph ends with duplicate closing parentheses.
Par 38 onward: I’d recommend prepending “Colab” before every instance of the word “notebook” in the remainder of this text to avoid confusion and make it clear that the instructions are tailored mainly to Colab users.
Par 39: Keep in mind that users might lack familiarity with Google Colab. Statements that might be intuitive to some, such as “Create a new cell” or “run a cell”, might not be as obvious to the uninitiated. Within the body of the text in this instruction, specify that, after copying the code, readers must also change the currently existing endpoint in the code to their own endpoint that they’ve previously copied from the Azure environment, and make sure that it is enclosed in quotation marks. Perhaps a screenshot from the Google Colab environment would be helpful here.
Par 40: Explain how one might run the cell. Also, specify that after running this cell they will get prompted for their secret computer vision key (KEY 1), which they need to paste inside the input box, and that they’re expected to hit Enter.
Par 41: Perhaps add another screenshot here. Also, it might be helpful to explain what readers should do if they get an error. Should they rerun the cell? If so, add it to the text or to the error message in the code.
Par 42: I would prioritize users running this on Google Colab as the preferred way to follow this tutorial, and consider removing “on your machine” from the title to avoid any confusion that readers might actually need to install something on their devices. Flip the order of the two final sentences. Consider adding a footnote on what a session is (although not important). Also, for users who run this locally, consider flagging that if the pip install line is run on the command line rather than in a notebook, they should remove the exclamation mark. The previous comment about how to create a new cell is also applicable here.
Par 43: The previous comment about how to create a new cell is also applicable here.
Par 44: Is this a public-domain image, and is it OK copyright-wise for others to use it while following the tutorial? If so, it’d be a good idea to mention it here so that readers know it’s safe to use. Perhaps coordinate with the PH team to save it under their domain to ensure greater chances of sustainability for this tutorial, and don’t forget to update the links. Also, what happens to the images that are processed by the Azure Computer Vision API? In certain cases, researchers might not be permitted to transfer their data to third parties. Therefore, it might be a good idea to add a disclaimer here, or a link detailing how Azure processes data sent to it through this kind of API.
Par 46: In this paragraph, I would suggest adding one more sentence to explicitly say that if readers want to try this method with other images stored online, they should replace the existing link after the comment “# Get an image with text. Set the URL of the image to transcribe.” with the link to the image that they’ve found online (and are permitted to use), in quotes. Alternatively, the same note could be added at the end of par 48.
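A minimal sketch of the suggested edit follows. The variable name is hypothetical and the URL is a placeholder; use whatever name actually appears in the lesson's code after that comment.

```python
# Hypothetical sketch of swapping in your own image's URL.

# Get an image with text. Set the URL of the image to transcribe.
read_image_url = "https://example.org/my-handwritten-page.jpg"  # your image's URL, in quotes
```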
Par 48: Consider expanding on what “Call Azure using computervision_client with the URL.” means. Beginners might not be familiar with API calls. Consider adding a screenshot of the result and commenting on it. This will not only help users know what to expect but will also give them a sense of how accurate this method can be.
Par 49: Same note as above regarding permission to use the image.
Par 50: Consider adding a screenshot to show how one might do this in Colab. Sometimes the vertical bar on the left can easily go unnoticed.
Par 54: Add a note that Colab users need not change this.
Par 54-56: There seems to be something wrong with the Markdown here. Make sure the post appears as intended.
At this point, as a more experienced user, I would be interested to know whether there are any parameters that I can tweak when making the API calls (such as the language that I’m interested in transcribing) to get more accurate results. Consider adding a link that will point more advanced users to further documentation.
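As one concrete example of such a parameter: Azure's Read REST endpoint accepts an optional `language` query parameter (e.g. "en") that can help accuracy when the language is known in advance. The path and parameter name below follow Azure's v3.2 REST documentation, but confirm them against the current API reference, and check whether the Python SDK's read call exposes the same option.

```python
# Illustrative only: building the Read "analyze" URL with an optional
# language hint, per Azure's v3.2 REST docs (verify before relying on this).

def build_read_url(endpoint, language=None):
    """Build the Read analyze URL, optionally pinning a language."""
    url = endpoint.rstrip("/") + "/vision/v3.2/read/analyze"
    if language:
        url += "?language=" + language
    return url

# Placeholder endpoint of the kind Azure issues per resource.
print(build_read_url("https://<your-resource>.cognitiveservices.azure.com", language="en"))
```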
Finally, I’d recommend that this lesson be aimed at beginners (provided that they follow it using the Google Colab route).
This tutorial is an important and enjoyable read; congratulations, and thanks a lot to the editors for giving me the chance to review it.
Kind regards, Maria
@giuliataurino
PS: Feel free to let me know if you have any questions or need any clarifications when it comes to my feedback.
Thank you for your review @mdermentzi!
@jeffblackadar, as I am still looking for a second reviewer I was wondering if you were able to take a look at this first review. Let me know if you have any questions!
Best,
Giulia
Hi Giulia, I have read the first reviewer's comments but have not made the changes yet. I will get that done soon. I have had some parent care to do recently, so I have not been at my computer as much. Thanks for your patience. Jeff
Hi @jeffblackadar and @giuliataurino,
Thank you for the opportunity to read this tutorial as a second reviewer! As someone with limited familiarity with handwriting transcription, I found the directions easy to follow and the code simple and efficient to use. I think it will become a valuable resource for historians. Like @mdermentzi, most of my comments are related to structure, or in anticipation of questions a beginner-level audience might have about Microsoft Azure and Google Colab.
P1-Your introduction clearly sets up the need for digital handwriting transcription. Though the historians you’re speaking to might not require much convincing, I’d love to see a tangible example of a handwritten document which would be beneficial to transcribe. I’d also recommend cutting off that paragraph at the second-to-last sentence and shifting your focus to digitization in the next, given that you spend some time coming back to this in p2. Along those lines, I’d echo @mdermentzi's comments to hone the focus of the first 2-3 paragraphs with your endpoint of working with Microsoft Azure in mind. I am also unfamiliar with the software, but from a quick search I found these descriptions on Microsoft’s website related to how their Computer Vision works:
“The image is then sent to an interpreting device. The interpreting device uses pattern recognition to break the image down, compare the patterns in the image against its library of known patterns, and determine if any of the content in the image is a match.”
“With deep learning, a computer vision application runs on a type of algorithm called a neural network, which allows it deliver even more accurate analyses of images."
(Source: https://azure.microsoft.com/en-us/resources/cloud-computing-dictionary/what-is-computer-vision/)
To me, this seems like a cross between the traditional OCR approaches you are describing in p2 and the CNNs in p3–though the term “interpreting device” is pretty nebulous! That said, you might do well to position your discussion in p2-3 as a general one, related to the various approaches of OCR, perhaps emphasizing how machine learning methods generally perform better than and/or enhance dictionary-based methods given the complexities of handwriting analysis (if that’s indeed the case). I’d recommend clarifying the relationship between CNNs and OCR too (CNNs are used for OCR, right? They’re not two separate technologies?) In any case, your contextual discussion could feed into a note at the start of p4 that acknowledges how commercial services use a combination of these methods, however transparent (or not) they make their approaches.
In paragraphs 4-6, you acknowledge a lot of important limitations to handwriting transcription. Speaking again as someone outside this field, it did seem like the emphasis was more on the limits than the benefits of this type of analysis. Perhaps this focus on critique is just realistic, and/or perhaps the benefits are implied, but could you expand any further on why, specifically, Microsoft Azure is a viable option for this type of work? I personally don’t have a frame of reference for Google/AWS accuracy and would be curious to hear more about it–what are the benchmarks beginners should look for when evaluating this type of service, and how does Azure measure up? This could be something you integrate later in the tutorial below–see my comments about discussing the sample image output.
P6-7-You introduce the tool as “Microsoft Azure Cognitive Services,” and I think a little more information about what the platform is could be helpful. Having mostly worked with Abbyy before, I was envisioning a desktop app from Microsoft, but obviously that’s not what it is. More context about Azure (and “Computer Vision” as a resource within it) might also make the last line of p7 clearer–you’re saying there’s documentation around it, but not around the coding aspect of it? And/or are there other coding platforms that its use has been documented for, but you’re contributing with a Python tutorial? Leaning into a focus on Colab users, I think, would help here, as you’d be positioning this tutorial as a simple-to-use pipeline that doesn’t require any purchases or local software downloads.
P7/Prerequisites–access to an internet connection seems to be implied, given the other prerequisites listed. Clarify that you don’t need to install Python on your machine if you’re using Google Colab, and point toward Colab tutorials here. You might also want to give context for the telephone number, like the credit/debit card, since it’s somewhat unusual. You might also want to clarify in parentheses that “Though there is a free tier of service for Microsoft, you are required to put a credit card on file.”
In Procedures (P8), perhaps clarify what you mean by “install Azure on your machine.” This too made me think Azure was a desktop platform like Abbyy, but it’s actually something to be installed in a coding environment, and doesn’t even need to be installed locally if you’re using Colab. Same with “access Computer Vision from your machine”. Especially since you are using “stored on your machine” to reference a locally stored file in step 6, just tweak the use of these terms above.
I second @mdermentzi's comments to nest the images to transcribe within your procedures. This section could be more readable/skimmable if you structured it in a bulleted list, like as follows:
Image Requirements:
I’m not sure you need all the sentences about conversion (as you acknowledge, it’s outside your scope and would assumedly be an implied step) and you could still put the caveats about experimentation below the list.
P9-It might be smoother to have a line before you start the numbered directions saying, “If you already have a personal Microsoft account, skip this section.” Along those lines, perhaps clarify here that you need a PERSONAL account, rather than a school/organizational one. As noted below, I ran into trouble trying to use a school account for this because I could not change my access to the feature and input a credit card.
P9–it might be more straightforward to direct users to the general Microsoft login page (https://account.microsoft.com/account/Account) to register for an account, especially since the first step of step 2 is to again go to portal.azure.com.
P10 to 16-When I tried to sign in with my personal Microsoft account, the steps you outlined worked perfectly. However, when I tried with my school account, I got an error message stating the feature was disabled through my school’s subscription. Not sure how common of an issue this would be, but perhaps put a disclaimer to make sure to use a personal account, rather than a school/organization account for this process.
P22–The plus signs breaking up your sentences make it a little choppy to read, so perhaps just remove them and say “Select a region, name the instance.”
P24–Before I clicked review, I saw that I also had the option to set the network, identity and tags for the project. I understand these are set by default, but maybe talk through the options/why they’re important/why you can just leave them alone?
P28–Clarify what you mean by “access this service through the computer” here as above.
P30–I would also like more clarity on the function of the key and the endpoint. Why does it give you 2 keys, and why do you only need one of them? Why is it important to keep the key a secret?
P30–I’m not quite sure what you mean by “check your code into a repository.” Is this just another way to say upload? And what do you mean by “avoid checking the file”--just erasing the key before you upload it to the repository?
P35–List this as a regular paragraph, rather than the 4th step in this process–I nearly regenerated my key and endpoint after copying/pasting/saving them and would have had to do the whole process again.
P36-I’d recommend making 3B a completely separate step (4) since we are really switching topics to working in a Python environment now
P36–Say “Create a Google Colab (or Python) notebook” in the P36 header and beyond.
P39–Consider giving a little more context about what an “environment variable” is, why to import os package, and what “basic validation” means.
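A minimal sketch of what these three terms mean in practice; the variable name below is a placeholder and not necessarily the one the lesson uses:

```python
import os

# An environment variable is a named value held by the operating system
# for the current session; storing the key there, rather than as literal
# text in the notebook, reduces the risk of publishing it by accident.
os.environ["COMPUTER_VISION_SUBSCRIPTION_KEY"] = "paste-your-key-here"  # placeholder

# "Basic validation" can be as simple as checking that a key was
# actually supplied before any API call is attempted:
key = os.environ.get("COMPUTER_VISION_SUBSCRIPTION_KEY", "")
if not key:
    raise RuntimeError("Set COMPUTER_VISION_SUBSCRIPTION_KEY before continuing.")
print("Key found, length:", len(key))
```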
P40–Clarify that you have to run the cell and then copy/paste your key into it before the output is generated.
P41–Why is it important to delete the text of your key?
P42–Yes, again clarify that you’re installing it in a Python environment, and that it’s not a local install when using Google Colab.
P43–Perhaps say “the code below” instead of “this code”; I thought you were referring to the code above, as there’s a bulleted list between the instruction and the code. You might give some more info on what libraries and authentication processes are for people not familiar with them.
P44–If possible, I’d be interested in learning more about the sample image, the challenges of transcribing it manually, and why you think computer vision would be valuable–in a sense, a tie-in to the demands you address at the beginning of this tutorial. If you’re bringing the image paragraph down here, you could also note why (or why not) this image is an ideal one to work with
P46–”Create” rather than “open” a new cell.
P46–It was helpful to see the bulleted list of outcomes in P43, before you ran that code cell. Here you tell users to run it, and then go back and describe each line below, but I’m wondering if reversing those things would be more instructive.
P48–Clarify that you have to change the URL to the image you are transcribing
P48–Could you include screenshots or code for the last two steps (read the results line by line and print the text if successful)? I think it would be particularly helpful to see what you mean by “the coordinates of a rectangle” and why that’s valuable information to have.
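To sketch what this might look like: the Read API returns, for each detected line, both the text and a bounding box, i.e. the pixel coordinates of the rectangle enclosing that line on the scan. The result below is fabricated to mimic that shape (the real response's field names may differ):

```python
# Fabricated result mimicking the shape of an Azure Read API response.
fake_result = {
    "readResults": [
        {"lines": [
            {"text": "Dear Sir,",
             "boundingBox": [10, 12, 180, 12, 180, 40, 10, 40]},
            {"text": "Thank you for your letter.",
             "boundingBox": [10, 50, 310, 50, 310, 78, 10, 78]},
        ]}
    ]
}

transcribed = []
for page in fake_result["readResults"]:
    for line in page["lines"]:
        # boundingBox holds four (x, y) corner pairs: the rectangle lets
        # you relate each transcribed line back to its place on the scan.
        transcribed.append(line["text"])
        print(line["text"], "at", line["boundingBox"][:2])

print(len(transcribed), "lines transcribed")
```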
P48–Not sure how complicated this would be to add, but I do think an additional step where you discuss how to export the results would be useful. Even if it’s just a note to copy/paste the text into a txt file, or if there’s a more readable format you could generate that also stores the lines separate from the coordinates (for readability).
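Export could indeed be as small as the following sketch, assuming the lines and coordinates have already been collected into a list; the filenames and data here are placeholders. Writing the plain text separately from the coordinates keeps the transcription readable:

```python
import csv

# Placeholder data standing in for lines returned by the Read API.
lines = [
    ("Dear Sir,", [10, 12, 180, 12, 180, 40, 10, 40]),
    ("Thank you for your letter.", [10, 50, 310, 50, 310, 78, 10, 78]),
]

# Readable transcription: the text only.
with open("transcription.txt", "w", encoding="utf-8") as f:
    f.write("\n".join(text for text, _ in lines))

# Coordinates stored separately, for anyone who needs them later.
with open("transcription_boxes.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["text", "bounding_box"])
    for text, box in lines:
        writer.writerow([text, " ".join(map(str, box))])
```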
P48–It might also be helpful to put the output and your file side-by-side here, to see how the output compares to the text, and reflect briefly on its accuracy and value. Especially as you discussed critiques of the process in the beginning, it could be helpful to model your own process for discerning what is useful and what is limiting about this tool.
P50–Consider providing more detailed guidance (and screenshots) for new Colab users who may not know how to upload a file to a directory.
P55–I think this is supposed to be code, next line is supposed to be text, then code again (just adjust formatting of blocks)
P56–Same as above, share code/screenshots for last two bullet points and consider a brief model of evaluating output.
P57– I was left wondering how to go from your code to the next steps you described (processing multiple images, storing transcribed text in a file/database). I know this is a beginner tutorial, and I’m not sure how complicated it would be to add any of these steps, but even saving output in a file seems like a valuable addition. Additionally, it might be interesting to share a different sample image when you walk through the process of transcribing a local image, perhaps a map or a spreadsheet like you are describing above, so you have another type of file to show output of. Just a suggestion, and it shows that your tutorial is piquing my interest about the capabilities of this tool.
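A minimal batch sketch along the lines suggested here; `transcribe` is a stub standing in for the lesson's Azure call so that the flow runs without credentials, and all names are placeholders:

```python
import pathlib

def transcribe(image_url):
    """Stub standing in for the lesson's Azure Read API call.
    Replace the body with the real client.read(...) polling logic."""
    return f"(transcription of {image_url})"

image_urls = [
    "https://example.com/diary_page_1.jpg",
    "https://example.com/diary_page_2.jpg",
]

out_dir = pathlib.Path("transcriptions")
out_dir.mkdir(exist_ok=True)

for url in image_urls:
    text = transcribe(url)
    # One .txt file per image, named after the image file.
    out_file = out_dir / (url.rsplit("/", 1)[-1] + ".txt")
    out_file.write_text(text, encoding="utf-8")
    print("saved", out_file)
```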
P58–What do you mean by “customize the training” and why isn’t it possible at this time? Also, this is purely subjective, but consider ending on an even stronger note! Your tutorial makes me intrigued about the possibilities of this type of analysis and I think you could say more on this here–for example, you could circle back to specific use cases or reflect on how exactly it could continue to grow (if this is something you’re excited about).
Overall, I really enjoyed reading this tutorial and learned a lot about the possibilities of digital handwriting transcription. Thanks to the editors for the chance to read it! If you have any questions about my comments or need any clarifications, don’t hesitate to reach out.
Best,
Megan
This lesson is now under review and can be read at: https://github.com/jeffblackadar/mre/blob/main/docs/en/lessons/transcribe_handwriting_2.md
Thank you very much @mdermentzi and @mkane968 for this thoughtful and detailed feedback. I have made some replies below. -Jeff
Hi @jeffblackadar, Thank you for this very useful tutorial. You’ve made it accessible enough so that someone without prior experience with Python or handwritten text recognition can follow along and start transcribing handwritten documents with minimal effort. My view is that it will be valuable to historians as well as archivists who need to perform such work for research or cataloging purposes. Once it’s published, I will definitely start using it and recommend it to the historians and archivists with whom I collaborate. I didn’t notice any serious issues with this tutorial. Most of my suggestions seek to simplify the structure or anticipate questions that beginners might have while following the steps. Mind you, I’m reviewing this lesson having just a week ago delivered a how-to workshop to historians with varying tech skills using Google Colab; seeing what they struggled with, my comments are aimed at ensuring that even complete beginners will find this tutorial as easy as possible. I didn’t focus on copy editing or compliance with the author guidelines, leaving these for the editors to check.
Overall, my main suggestion would be to make it clear early on that you recommend users follow this tutorial using Google Colab and prioritise this type of users throughout your instructions. This will make the prerequisites section more straightforward and the tutorial easier to follow. If you choose to do this, adding more screenshots of the Colab environment is also important.
JB: Great feedback.
Here’s my detailed feedback:
At the beginning of the tutorial, the reader might benefit from a short and concise Learning Objectives or learning goals section similar to other Programming Historian tutorials.
JB: I have added a short lesson objective.
Par 1, final sentence – add OCR abbreviation in this paragraph and from here onward use OCR
JB: Added (OCR), OCR is now used onward.
Paragraphs 2, 3 & 4: It is quite possible that I might have missed something but, having tried to find out what model is powering the Azure computer vision service showcased in this tutorial, my understanding is that Microsoft does not clarify what model architectures (or what datasets as the tutorial rightly points out) they have used to train the models powering their APIs. For this reason, a focus on CNNs might be a bit redundant; it is perhaps giving the idea that this is what is powering the Azure service, which may or may not be true. For example (and this is not my area of expertise–I only did a quick search for this so I could be wrong), in recent years, transformers have also been used for OCR. My suggestion, therefore, would be to remove direct mentions of CNNs, as they might additionally be alienating to beginners.
JB: That’s a great point.
If, however, the purpose of referring to the CNNs is to provide more context about the progress of this field, my suggestion would be to add a disclaimer clarifying that we don’t know how the Azure service, which is showcased later in the tutorial, works. Additionally, if this part is kept, I would suggest expanding on it a bit more, citing even key papers that led to relevant breakthroughs. From my experience working with historians in Europe, many of them have tried such tools before (granted, without having trained custom models) and are skeptical about their success, so recent advances in AI might encourage them to give handwriting recognition another go.
More detailed suggestions per paragraph:
Par 2: I’d recommend starting this paragraph with “Digitally transcribing [...]” It’d be helpful to remove the parentheses and better integrate the PH references in one or more sentences starting with something like “Previous programming historian tutorials that have demonstrated typed text recognition include: ”
Consider adding one more reference to the latest PH OCR lesson that uses the Google Vision API (https://programminghistorian.org/en/lessons/ocr-with-google-vision-and-tesseract) either here or in another paragraph.
And then you could continue the paragraph by adding the first sentence but with small changes, such as: Recent advances in artificial intelligence offer the ability for historians to automatically transcribe handwritten documents [...] In the bit where it says “within limits of types of letters used, language and legibility.”, the expression “types of letters” might read better and be more inclusive if changed to “writing systems”
JB: Done
Final sentence: Remove mention of CNN and add another disclaimer here to make it clear that this is only true for certain writing systems and languages so that readers won’t get disappointed if they get bad results when trying this with images including texts written in lower-resource languages and writing systems.
I would cut paragraph 3 and keep paragraph 4 but remove the CNN bit towards the end. To make up for removing these parts, you could add another sentence somewhere explaining that these models are only as good as the data on which they were trained and advising historians to keep in mind that their results will reflect their training data, with all the biases stemming from how and by whom the training dataset was put together.
JB: I had some earlier feedback to include this for context. This was to differentiate this approach from OCR and also provide a rationale for using this versus training a model from scratch.
Par 4:
“as long as these documents are recognizable to the service.” – expand on what recognizable means in this context, for example in terms of the writing system used, language, file type, etc. Final two sentences: fix “is not” to “are not”. It might be best to cut from “I assume [...] property”. Final sentence: it’d be interesting to know what this assumption is based on. Is it based on personal experience using some of these services, or on how similar models whose details we do know work?
JB: It's based on personal experience. I haven't seen details of how these models are trained, but have read a general description. I've used the models from the three companies and made comparisons, but don't have statistics. I've run a bunch of different documents through the services and seen where they have worked well or failed. Some notes are here.
Par 6: In this paragraph, I would strongly recommend adding what languages are currently supported.
Par 7: It would be interesting to read what scripts and languages you’ve tried it with.
JB Answer: For Microsoft, Google and AWS I’ve only used English and French. With Google I’ve used it for Persian and Arabic – but I did not want to distract the reader by talking about Google’s service.
JB: I have added links to the supported languages
First requirement: I’d suggest changing to something like “Knowledge of Python is not required since all of the code is provided in the tutorial. That said, basic Python knowledge would be useful for users who wish to understand the code or to tweak it for their purposes.”
JB: Done
Second requirement: I’d suggest changing to “Google Colab, a web-based virtual Python programming platform, was used to write this lesson. If you choose to use Google Colab to program Python (recommended), a Google account is required. If you choose to run the code in this tutorial locally on your own machine, Python and pip need to be installed.” Also, it would be good to check if there is a specific version of Python required (I think it's 3+) and, if so, add this to the text. Perhaps add another footnote here to point to the python-sdk quickstart guide found later within the text.
Fourth requirement: Change to “Credit card or debit card” so that those with no access to credit are not discouraged.
Consider whether you want to address users who are already familiar with Google Colab or not. If familiarity with Google Colab is not listed in this section, there could be more screenshots and explanations about how to create new cells and run cells in Google Colab after paragraph 39. My recommendation would be to simply add more screenshots and instructions, because this tutorial could be easily followed by beginners as long as they’re not getting confused by simple steps that might, instead, come intuitively to more experienced users.
The procedure section (par 8) followed by the separate Images to transcribe section (par 9) is making the structure of this tutorial slightly confusing. I would suggest either flipping the two sections or making steps 5 and 6 of the procedure section subsections of a parent section called “Transcribe handwriting”, which would start with the “Images to transcribe section” subsection.
So perhaps a better structure would be:
JB: Thanks for this, done.
Par 9: In the Images to transcribe section, I would start by saying that “Microsoft’s Azure Cognitive Services require that images used [...]”.
JB: Done.
Par 10-16: Create a “Computer Vision” Resource in Azure to perform transcription. When following this process, I didn’t get a “start with an Azure free trial” message. Instead, I got a “Checking on your subscription” message and then Azure asked me to upgrade my account. Apparently, I was not eligible for an Azure free account, and so I had to sign up for Azure with the pay-as-you-go pricing. This didn’t imply that I actually had to pay anything, but it felt unclear and intimidating. Therefore, it might be useful to update the text of the tutorial so as to include this as a potential scenario for those who don’t see the Azure free trial prompt and clarify that they won’t actually get charged, because there are free quotas available in the pay-as-you-go subscription (unless they have already spent them).
Par 20: Instead of Azure subscription 1, there was a second option “Free trial”, which is the one that I selected. I can see that there have been many months since this tutorial was first submitted, so it might be worth going over the process once again to check if these instructions are up to date. The rest of the instructions including Par 22 were correct. (Pricing tier to Free F0 etc)
JB: Edits made.
Par 28-29: Here, beginners would benefit from more information on what an endpoint and keys are. Consider adding a footnote to offer some context.
JB: Done.
Par 30: This paragraph might be confusing to users who’ve been following the tutorial using Colab and have not created any folders. Also, perhaps it’d be more straightforward to start the paragraph with the sentence that is currently last in this paragraph and make the distinction of what users need to do depending on whether they’re using Github or not. In any case, make it clear that these keys are not meant to be shared with anyone under any circumstances. Also, consider integrating this paragraph later in the text into par 34, where users are asked to copy KEY 1.
JB: I agree, I have edited this
Par 36 Make Colab link clickable. Par ends with duplicate closing parentheses.
JB: Done
Par 38 onward: I’d recommend prepending “Colab” before every instance of the word “notebook” in the remainder of this text to avoid confusion and make it clear that the instructions are tailored mainly to Colab users.
Par 39: Keep in mind that users might lack familiarity with Google Colab. Statements that might be intuitive to some, such as “Create a new cell” or “run a cell”, might not be as obvious to the uninitiated. Within the body of the text in this instruction, specify that, after copying the code, readers must also change the currently existing endpoint in the code to their own endpoint that they’ve previously copied from the Azure environment and make sure that it will be enclosed in quotation marks. Perhaps a screenshot from the Google Colab environment would be helpful here.
JB: Done
Par 40: Explain how one might run the cell. Also, specify that after running this cell they will get prompted for their secret computer vision key (KEY 1), which they need to paste inside the input box, and that they’re expected to hit Enter.
JB: Done
Par 41 Perhaps add another screenshot here. Also, it might be helpful to explain what they should do if they get an error. Should they rerun the cell? If so, add it to the text or to the error message in the code.
JB:Done
Par 42 I would say prioritize users running this on Google Colab as the preferred way to follow this tutorial and consider removing “on your machine” from the title to avoid any confusion that readers might actually need to install something on their devices. Flip the order of the two final sentences. Consider adding a footnote on what a session is (although not important). Also, for users who run this locally, consider flagging that if the pip install line is not run through a notebook but rather on the command line, then they should remove the exclamation mark. The previous comment about how to create a new cell is also applicable here.
Par 43 The previous comment about how to create a new cell is also applicable here. JB: should be fixed now.
Par 44 Is this a public domain image and is it OK copyright-wise for others to use it while following the tutorial?
JB Answer: I took the photograph - I will ask to save it with PH.
If so, it’d be a good idea to mention it here so that readers know it’s safe to use it. Perhaps coordinate with the PH team to save it under their domain to ensure greater chances of sustainability for this tutorial, and don’t forget to update the links. Also, what happens to the images that are getting processed by the Azure Computer Vision API? In certain cases, researchers might not be permitted to transfer their data to third parties. Therefore, it might be a good idea to add a disclaimer here, or a link detailing how Azure processes data sent to it through this kind of API.
JB: I've added a note under Image requirements
Par 46 In this paragraph, I would suggest adding one more sentence to explicitly say that if readers want to try this method with their other images stored online they should replace the existing link after the comment “# Get an image with text. Set the URL of the image to transcribe.” with the link to the image that they’ve found online (and are permitted to use) in quotes. Alternatively, the same note could be added at the end of par 48.
JB: Added a note
Par 48: Consider expanding on what “Call Azure using computervision_client with the URL.” means. Beginners might not be familiar with API calls. Consider adding a screenshot of the result and commenting on it. This will not only help users know what to expect but will also give them a sense of how accurate this method can be.
Par 49: Same note as above regarding permission to use the image.
JB: Added a note
Par 50: Consider adding a screenshot to show how one might do this in Colab. Sometimes the vertical bar on the left can easily go unnoticed.
Par 54: Add a note that Colab users need not change this.
Par 54-56: There seems to be something wrong with markdown here. Make sure the post appears as intended.
JB: Fixed
At this point, as a more experienced user, I would be interested to know whether there are any parameters that I can tweak when making the API calls (such as the language that I’m interested in transcribing) to get more accurate results. Consider adding a link that will point more advanced users to further documentation.
Finally, I’d recommend this lesson to be aimed at beginners (provided that they reproduce it using the Google Colab route). This tutorial is an important and enjoyable read; congratulations, and thanks a lot to the editors for giving me the chance to review it.
Kind regards, Maria
@giuliataurino
PS: Feel free to let me know if you have any questions or need any clarifications when it comes to my feedback.
Thanks for this Maria! Jeff
Hi @jeffblackadar and @giuliataurino, Thank you for the opportunity to read this tutorial as a second reviewer! As someone with limited familiarity with handwriting transcription, I found the directions easy to follow and the code simple and efficient to use. I think it will become a valuable resource for historians. Like @mdermentzi, most of my comments are related to structure, or in anticipation of questions a beginner-level audience might have about Microsoft Azure and Google Colab.
P1-Your introduction clearly sets up the need for digital handwriting transcription. Though the historians you’re speaking to might not require much convincing, I’d love to see a tangible example of a handwritten document which would be beneficial to transcribe.
JB: Added “Sources such as diaries, letters, logbooks and reports”
I’d also recommend cutting off that paragraph at the second-to-last sentence and shifting your focus to digitization in the next, given that you spend some time coming back to this in p2. Along those lines, I’d echo @mdermentzi's comments to hone the focus of the first 2-3 paragraphs with your endpoint of working with Microsoft Azure in mind. I am also unfamiliar with the software, but from a quick search I found these descriptions on Microsoft’s website related to how their Computer Vision works: “The image is then sent to an interpreting device. The interpreting device uses pattern recognition to break the image down, compare the patterns in the image against its library of known patterns, and determine if any of the content in the image is a match.” “With deep learning, a computer vision application runs on a type of algorithm called a neural network, which allows it deliver even more accurate analyses of images."
(Source: https://azure.microsoft.com/en-us/resources/cloud-computing-dictionary/what-is-computer-vision/)
To me, this seems like a cross between the traditional OCR approaches you are describing in p2 and the CNNs in p3–though the term “interpreting device” is pretty nebulous! That said, you might do well to position your discussion in p2-3 as a general one, related to the various approaches of OCR, perhaps emphasizing how machine learning methods generally perform better than and/or enhance dictionary-based methods given the complexities of handwriting analysis (if that’s indeed the case). I’d recommend clarifying the relationship between CNNs and OCR too (CNNs are used for OCR, right? They’re not two separate technologies?) In any case, your contextual discussion could feed into a note at the start of p4 that acknowledges how commercial services use a combination of these methods, however transparent (or not) they make their approaches.
JB: Q: CNNs are used for OCR, right? They’re not two separate technologies?
JB Answer: OCR is based on different technology that predates CNNs and Transformers. Traditional OCR recognizes printed letters using patterns that humans have coded, whereas a CNN learns on its own to recognize patterns present in the images it is trained on.
JB: Thanks for the feedback above, I have changed the language to use the terms deep learning and computer vision. This may make this more approachable. I wanted to set the context that this is different than OCR, and that’s why it’s worth considering as a tool.
In paragraphs 4-6, you acknowledge a lot of important limitations to handwriting transcription. Speaking again as someone outside this field, it did seem like the emphasis was more on the limits than the benefits of this type of analysis. Perhaps this focus on critique is just realistic, and/or perhaps the benefits are implied, but could you expand any further on why, specifically, Microsoft Azure is a viable option for this type of work? I personally don’t have a frame of reference for Google/AWS accuracy and would be curious to hear more about it–what are the benchmarks beginners should look for when evaluating this type of service, and how does Azure measure up? This could be something you integrate later in the tutorial below–see my comments about discussing the sample image output.
JB: I was being too realistic. I want to convey that it works well for handwritten documents, but not on everything (to avoid setting expectations too high). Unfortunately, I don’t have product comparison benchmarks, though I looked for them. I saw one other website that did a comparison, and I did some work on that as well, but without statistics.
P6-7-You introduce the tool as “Microsoft Azure Cognitive Services,” and I think a little more information about what the platform is could be helpful. Having mostly worked with Abbyy before, I was envisioning a desktop app from Microsoft, but obviously that’s not what it is. More context about Azure (and “Computer Vision” as a resource within it) might also make the last line of p7 clearer–you’re saying there’s documentation around it, but not around the coding aspect of it?
JB Answer: There is a tutorial from Microsoft about how to write a Python program to access this. The PH tutorial is meant to be a more accessible tutorial.
And/or are there other coding platforms that its use has been documented for, but you’re contributing with a Python tutorial?
JB Answer: Other languages can be used, I’ve selected Python here since it’s fairly popular on PH.
Leaning into a focus on Colab users, I think, would help here, as you’d be positioning this tutorial as a simple-to-use pipeline that doesn’t require any purchases or local software downloads.
P7/Prerequisites–access to an internet connection seems to be implied, given the other prerequisites listed. Clarify that you don’t need to install Python on your machine if you’re using Google Colab, and point toward Colab tutorials here. You might also want to give context for the telephone number, like the credit/debit card, since it’s somewhat unusual. You might also want to clarify in parentheses that “Though there is a free tier of service for Microsoft, you are required to put a credit card on file.”
JB: done
In Procedures (P8), perhaps clarify what you mean by “install Azure on your machine.” This too made me think Azure was a desktop platform like Abbyy, but it’s actually something to be installed in a coding environment, and doesn’t even need to be installed locally if you’re using Colab. Same with “access Computer Vision from your machine.” Especially since you are using “stored on your machine” to reference a locally stored file in step 6, just tweak the use of these terms above. I second @mdermentzi's comments to nest the images to transcribe within your procedures. This section could be more readable/skimmable if you structured it in a bulleted list, as follows:
Image Requirements:
• Acceptable Formats: JPEG, PNG, GIF, BMP
• Min Size: 50 x 50 px (how many GB/MB?)
• Max size: 4 MB (how many px?)
JB: I've made most changes above. With different image formats, the file size of a 50x50 image, or the number of pixels in a 4 MB image, is variable. I’ll stay away from this rather than be wrong most of the time.
I’m not sure you need all the sentences about conversion (as you acknowledge, it’s outside your scope and would assumedly be an implied step) and you could still put the caveats about experimentation below the list.
JB: Will do.
P9-It might be smoother to have a line before you start the numbered directions saying, “If you already have a personal Microsoft account, skip this section.” Along those lines, perhaps clarify here that you need a PERSONAL account, rather than a school/organizational one. As noted below, I ran into trouble trying to use a school account for this because I could not change my access to the feature and input a credit card.
P9–it might be more straightforward to direct users to the general Microsoft login page (https://account.microsoft.com/account/Account) to register for an account, especially since the first part of step 2 is to again go to portal.azure.com.
JB: I got different behavior when I tested this. I made a new account with the above link, but http://portal.azure.com/ didn't recognize it afterwards. I'm going to stick with the original link since that works in testing.
P10 to 16-When I tried to sign in with my personal Microsoft account, the steps you outlined worked perfectly. However, when I tried with my school account, I got an error message stating the feature was disabled through my school’s subscription. Not sure how common of an issue this would be, but perhaps put a disclaimer to make sure to use a personal account, rather than a school/organization account for this process.
JB: I’ve made a note about this
P22–The plus signs breaking up your sentences make it a little choppy to read, so perhaps just remove the plus signs and say “Select a region, name the instance.”
JB: done
P24–Before I clicked review, I saw that I also had the option to set the network, identity and tags for the project. I understand these are set by default, but maybe talk through the options/why they’re important/why you can just leave them alone?
JB I added: The "Identity" and "Tags" tabs can be left with default values. They are relevant only if you are using this in combination with other Microsoft Azure services.
P28–Clarify what you mean by “access this service through the computer” here as above.
JB: changed to "your python environment."
P30–I would also like more clarity on the function of the key and the endpoint. Why does it give you 2 keys, and why do you only need one of them? Why is it important to keep the key a secret?
P30–I’m not quite sure what you mean by “check your code into a repository.” Is this just another way to say upload? And what do you mean by “avoid checking the file”--just erasing the key before you upload it to the repository?
JB: I've made mistakes where I had keys in my code and then checked the code into GitHub. Then my keys were there for the world to see and use. I want to be careful to help people avoid that.
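A common pattern that avoids this problem (my sketch, not code from the lesson; the variable name `AZURE_CV_KEY` is an arbitrary example) is to read the key from an environment variable, so it lives in your shell or Colab session rather than in any file you might commit:

```python
import os

def load_key(var_name="AZURE_CV_KEY"):
    """Fetch an API key from an environment variable.

    Because the key is set in the shell (or a Colab form) rather than
    typed into the notebook, it never lands in a Git repository.
    """
    key = os.environ.get(var_name)
    if not key:
        raise RuntimeError(f"Environment variable {var_name} is not set.")
    return key
```

You would set the variable outside your code before running, for example `export AZURE_CV_KEY="..."` in a terminal.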
P35–List this as a regular paragraph, rather than the 4th step in this process–I nearly regenerated my key and endpoint after copying/pasting/saving them and would have had to do the whole process again.
JB: Thank you
P36-I’d recommend making 3B a completely separate step (4) since we are really switching topics to working in a Python environment now
JB: Done
P36–Say “Create a Google Colab (or Python) notebook” in the P36 header and beyond.
P39–Consider giving a little more context about what an “environment variable” is, why to import the os package, and what “basic validation” means.
JB: Noted the use of os and environment variables. I did not want to go too deep on that though.
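For what “basic validation” could look like in practice, here is a hypothetical sketch: two quick sanity checks on the pasted key and endpoint before any call is made. The 32-character hexadecimal key format and the `cognitiveservices` endpoint host are typical of Azure at the time of writing, but they are assumptions, not guarantees:

```python
def looks_like_key(key):
    """Sanity check: Cognitive Services keys are typically 32-character
    hexadecimal strings (an assumption about Azure's current format)."""
    return len(key) == 32 and all(c in "0123456789abcdef" for c in key.lower())

def looks_like_endpoint(url):
    """The endpoint copied from the Azure portal is an https URL,
    usually on a *.cognitiveservices.azure.com host."""
    return url.startswith("https://") and ".cognitiveservices." in url

# These checks can't prove a key is valid, but they catch the most
# common paste errors (truncated key, http endpoint, wrong field).
```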
P40–Clarify that you have to run the cell and then copy/paste your key into it before the output is generated.
JB: Done
P41–Why is it important to delete the text of your key?
JB: It's just another way to prevent someone else from copying it from a printout.
P42–Yes, again clarify that you’re installing it in a Python environment, and that it’s not a local install when using Google Colab.
P43–Perhaps say “the code below” instead of “this code”; I thought you were referring to the code above, as there’s a bulleted list between the instruction and the code. You might give some more info on what libraries and authentication processes are for people not familiar with them.
JB: Changing to the code below
P44–If possible, I’d be interested in learning more about the sample image, the challenges of transcribing it manually, and why you think computer vision would be valuable–in a sense, a tie-in to the demands you address at the beginning of this tutorial. If you’re bringing the image paragraph down here, you could also note why (or why not) this image is an ideal one to work with
P46–”Create” rather than “open” a new cell.
JB: Done
P46–It was helpful to see the bulleted list of outcomes in P43, before you ran that code cell. Here you tell users to run it, and then go back and describe each line below, but I’m wondering if reversing those things would be more instructive.
JB: I've changed this
P48–Clarify that you have to change the URL to the image you are transcribing
JB: Done
P48–Could you include screenshots or code for the last two steps (read the results line by line and print the text if successful)? I think it would be particularly helpful to see what you mean by “the coordinates of a rectangle” and why that’s valuable information to have.
JB: I've added the sample output
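For readers curious about the “coordinates of a rectangle”: each transcribed line comes back with a bounding box given as a flat list of eight numbers, the (x, y) pairs of the four corners. A small helper (my sketch, not code from the lesson) turns that into left/top/width/height, which is handy for cropping or overlaying results on the page image:

```python
def bounding_rect(bounding_box):
    """Convert a flat [x1, y1, x2, y2, x3, y3, x4, y4] corner list
    into (left, top, width, height)."""
    xs = bounding_box[0::2]   # every even index: the x coordinates
    ys = bounding_box[1::2]   # every odd index: the y coordinates
    left, top = min(xs), min(ys)
    return (left, top, max(xs) - left, max(ys) - top)

# Example: a line whose corners outline roughly a 200 x 30 px box.
print(bounding_rect([10, 40, 210, 42, 210, 70, 10, 68]))  # → (10, 40, 200, 30)
```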
P48–Not sure how complicated this would be to add, but I do think an additional step where you discuss how to export the results would be useful. Even if it’s just a note to copy/paste the text into a txt file, or if there’s a more readable format you could generate that also stores the lines separate from the coordinates (for readability).
JB: I've added two more steps to export the data 6.iv and 6.v
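One readable export format that keeps the text separate from the coordinates is CSV, one transcribed line per row. This is only a sketch of the idea, with made-up (text, bounding box) pairs standing in for the real objects the service returns:

```python
import csv

# Stand-ins for the (line.text, line.bounding_box) values from the service.
lines = [
    ("Dear Sir,", [10, 40, 210, 42, 210, 70, 10, 68]),
    ("Thank you for your letter.", [10, 80, 320, 82, 320, 110, 10, 108]),
]

with open("transcription.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["text", "bounding_box"])
    for text, box in lines:
        # Store the coordinates as one semicolon-joined field so the
        # text column stays easy to read on its own.
        writer.writerow([text, ";".join(str(n) for n in box)])
```

Opened in a spreadsheet, the first column gives a clean reading text while the second preserves the layout information.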
P48–It might also be helpful to put the output and your file side-by-side here, to see how the output compares to the text, and reflect briefly on its accuracy and value. Especially as you discussed critiques of the process in the beginning, it could be helpful to model your own process for discerning what is useful and what is limiting about this tool.
JB: That's a good idea.
P50–Consider providing more detailed guidance (and screenshots) for new Colab users who may not know how to upload a file to a directory.
JB: Added a screenshot
P55–I think this is supposed to be code, next line is supposed to be text, then code again (just adjust formatting of blocks)
JB: Fixed
P56–Same as above, share code/screenshots for last two bullet points and consider a brief model of evaluating output.
JB: Fixed
P57– I was left wondering how to go from your code to the next steps you described (processing multiple images, storing transcribed text in a file/database).
JB: I've added an export to a file - I hope this fits
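To extend the single-file export toward “processing multiple images,” a loop like the one below is the usual scaffolding. The `transcribe` function here is a hypothetical stand-in for the lesson's call to the Azure service, and the folder and file names are demo values:

```python
import os

def transcribe(image_path):
    # Stand-in for the lesson's Azure transcription call; it would
    # return the transcribed text of the image at image_path.
    return f"(transcription of {os.path.basename(image_path)})"

image_folder = "images"  # hypothetical folder of page photographs
os.makedirs(image_folder, exist_ok=True)
open(os.path.join(image_folder, "page_001.jpg"), "wb").close()  # demo file

# Transcribe every image in the folder, saving one .txt file per image.
for name in sorted(os.listdir(image_folder)):
    if not name.lower().endswith((".jpg", ".jpeg", ".png")):
        continue
    text = transcribe(os.path.join(image_folder, name))
    out_path = os.path.join(image_folder, os.path.splitext(name)[0] + ".txt")
    with open(out_path, "w", encoding="utf-8") as f:
        f.write(text)
```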
I know this is a beginner tutorial, and I’m not sure how complicated it would be to add any of these steps, but even saving output in a file seems like a valuable addition. Additionally, it might be interesting to share a different sample image when you walk through the process of transcribing a local image, perhaps a map or a spreadsheet like you are describing above, so you have another type of file to show output of. Just a suggestion, and it shows that your tutorial is piquing my interest about the capabilities of this tool.
JB: Giving this thought for getting a usable, authorized image
P58–What do you mean by “customize the training” and why isn’t it possible at this time?
JB: Answer: I was meaning I can't customize the training of the handwriting styles, great call, I've edited this
Also, this is purely subjective, but consider ending on an even stronger note! Your tutorial makes me intrigued about the possibilities of this type of analysis and I think you could say more on this here–for example, you could circle back to specific use cases or reflect on how exactly it could continue to grow (if this is something you’re excited about).
JB: I will look to edit this a bit more, it is pretty amazing.
Overall, I really enjoyed reading this tutorial and learned a lot about the possibilities of digital handwriting transcription. Thanks to the editors for the chance to read it! If you have any questions about my comments or need any clarifications, don’t hesitate to reach out. Best, Megan
-- Thanks for this Megan! Jeff
Hi, @giuliataurino, @mdermentzi and @mkane968, - Thank you for the feedback! I have made revisions and I will look for remaining errors. I believe I have addressed the items in the feedback. I have reorganized the sections, added a bit more about Google Colab, edited the introduction to remove CNN, added the ability to save results to a file and added a link to more documentation. Thanks again for your reviews, they have made this a better organized and more complete tutorial.
A question I have is, is there a place I should put the page images I've made for people to download and is there something I need to do to designate them as open access?
Thanks Jeff
Hello @jeffblackadar.
We can save any downloads which accompany your lesson in our assets repository where readers will be able to access them. Could you email them to me? admin@programminghistorian.org.
Thank you.
Hi, I just wanted to check on the status of putting these files in the assets repository please - I think it's one of the last things we need to do.
I'll just resend this: Here please are some files for the assets repository. These are images I photographed.
The Library and Archives Canada has now published William White's diary here:
White, William. 1917. William Andrew White fonds, R15535-0-8-E, "1917 Diary", Item ID number 4818067. Library and Archives Canada. http://central.bac-lac.gc.ca/.redirect?app=fonandcol&id=4818067&lang=eng. Accessed August 18, 2023.
Thanks, Jeff
Hello @jeffblackadar. Thank you for your follow up. I've emailed you as I have a couple of questions/points to clarify about these image assets. Very best, Anisa
Thanks very much Anisa, I have forwarded answers. Best regards, Jeff
Thank you, @jeffblackadar. As discussed by email, I've zipped the 4 sample images you've selected and have uploaded them to our assets repository.
Our next step is copyediting (our Publishing Assistant, Charlotte, will be working on this). It is important that no further changes are made to the .md file while Charlotte is working. She will be in touch with any questions that arise, and I estimate that this will be ~12th October.
Very best, Anisa
Hello @jeffblackadar and @hawc2 (in place of Giulia for now). I've prepared the copyedits for this lesson to commit in PR #591. I'd be grateful if you could review the adjustments and confirm that you are happy for me to merge these. You can see the details of my edits here! Sorry if it appears slightly confusing – one of the changes involved changing the lesson title and slug to "transcribing-handwritten-text-with-python-and-azure".
You can respond to any of my suggestions via the comments, or click to Resolve conversation if you're happy with them like this:
If you want to make edits, please work here, accessing the edit facility by clicking the three dots at upper right of the file named transcribing-handwritten-text-with-python-and-azure:
These look good to me @charlottejmc. Once these changes have been approved, I have some suggestions for further revision @jeffblackadar that I can delineate for you to do before we publish
Hello @jeffblackadar and @hawc2,
Thank you for reviewing Charlotte's copyedit PR. I've merged this in so that you read the updated lesson in the web preview:
--
When you are both happy, Charlotte and I will take this lesson through its next steps:
- Typesetting + final checks of metadata
- Generating archival hyperlinks
Very best, Anisa
Hi Anisa
This looks great, thank you. A couple small things:
P55 - small edit needed below: the program has transcribed “prote Inier” instead of “wrote Izie”.
P56 - Since the file to be downloaded is now called td_00040_b2.jpeg, the code has an error when it's run. We should change the code in P60 and P66 to the new file name, or use the td_00044_b2.jpg file name for the download (which matches the image). I recommend changing the file name of the file to be downloaded to td_00044_b2.jpg:
read_image_path = os.path.join(images_folder, "td_00044_b2.jpg")
P68 - This link to download images does not work yet, unlike the download link in P56. (The code ran as expected.)
Best regards Jeff
Hi Jeff,
Thank you for flagging these up!
I have made these changes in the file. I did as you suggested and redirected the download link to show td_00044_b2.jpg instead of td_00040_b2.jpeg.
I also created a new zip file in the assets directory containing all the images, and redirected the link to download at P68 straight to that.
The lesson is now ready for typesetting, which I will be working on in the next two days. I hope to have it ready for you by the end of Friday.
Best, Charlotte
Hello @hawc2 ,
This lesson's sustainability + accessibility checks are in progress.
Publisher's sustainability + accessibility actions:
Authorial / editorial input to YAML:
- `difficulty:` based on the criteria set out here
- `activity:` this lesson supports (acquiring, transforming, analysing, presenting, or sustaining) – I've suggested `transforming`, but let me know what you think.
- `topics:` (apis, python, data-management, data-manipulation, distant-reading, set-up, linked-open-data, mapping, network-analysis, web-scraping, digital-publishing, r, or machine-learning). Choose one or more. Let us know if you'd like us to add a new topic.
- `alt-text` for all figures – Figures 3 and 7 are missing their alt-text description!
- `abstract:` for the lesson
- An avatar (thumbnail) image. The image must be:
  - copyright-free
  - non-offensive
  - an illustration (not a photograph)
  - at least 200 pixels width and height
  Image collections of the British Library, Internet Archive Book Images, Library of Congress Maps or the Virtual Manuscript Library of Switzerland are useful places to search.
- `avatar_alt:` (visual description of that thumbnail image)
- A bio added to `ph_authors.yml` using this template:

      - name: Jeff Blackadar
        orcid: 0000-0002-8160-0942
        team: false
        bio:
          en: |
            Jeff Blackadar has a Master of Arts in History with a specialization in Data Science from Carleton University.
Files to prepare for transfer to Jekyll:
EN:
Promotion:
Thanks for this Charlotte
Just in case, I am sending my ORCID: https://orcid.org/0000-0002-8160-0942
Best regards Jeff
Hi @jeffblackadar,
I had a little look around for an avatar and came across this one (https://www.loc.gov/item/2004662405/) which could look like this, cropped and greyscaled:
What do you think? Please also feel free to find one on your side, if you have a different idea!
I love it - it works really well with the content. Thank you for finding it @charlottejmc, Jeff
Hi @jeffblackadar, I've had a chance to read through your tutorial and make some line edits before we move forward with publication. I have a series of minor revision requests that I think will help flesh this tutorial out a little more and make it more in keeping with most of our Programming Historian lessons.
The introduction was rather long, so I made a few attempts at breaking it up into sections. Feel free to change those if they don't represent what you intended. There are a couple places in the introduction where it could be useful if you give more information - for instance, you bring up related PH lessons, like the one on Google Vision, but you don't really say how they are connected. That one seems especially relevant to compare in some way with your tutorial, just a couple sentences explaining how the reader can see the two in tandem. Note I moved that section down to Prerequisites, where it makes more sense.
One concern we have about this lesson is it depends on "commercial" software. Perhaps you could say a little more early on about any free trials or ways people might demo this software without a cost (for instance, when you first mention it is a commercial software). I see under Prerequisites you mention the free tier in relation to the credit card being required, but it might be worth stating that earlier as well.
It is also worth explaining, when you bring up the range of commercial options, why this particular process is so popular for commercial purposes, and what proprietary commercial transcription softwares offer over and above open-source ones. After explaining the advantages of commercial software, it would be helpful to point readers to free open-source options before digging into this commercial option. One problem with commercial software and tutorials like this is that they are often unsustainable, since the companies change them so often and there's no open-source option to point to. Anything you can say about that in the tutorial would also be worth bringing up in relation to the 'commercial' service.
We typically prefer for Programming Historian lessons to explore some kind of research question. It might help if you say a little more about how historians can use transcription to pursue actual research projects, giving concrete examples. With your sample datasets/projects, just make a few nods throughout to the research questions this transcription process can make easier to answer.
For example, in the case of "Working with an image found online," is there anything about the particular image you can say represents a specific historical period or style that the transcription process could help the historian explore? Just a sentence or two would be nice to link it to the broader point of the lesson. I see you put an Endnote here where you say: "This is an image from the 1917 wartime diary of Captain William Andrew White photographed by the author during research." I wonder why this is a footnote? Seems like a relevant historical detail to talk about in terms of how this transcription process relates to historical research.
As an example, this lesson is much more step-by-step than most we publish, so there are lots of opportunities at the beginning of each section where you could add a little commentary, providing broader context for the purposes of each step. Instead of each section being only an alphabetized list of steps, you could add a few comments to introduce each section in paragraph form. This is what you do for Step 3, for example. But the section "Installing Azure Computer Vision in your Python environment" only begins with a link to a Microsoft resource; you could also take a moment to spell out for the reader here: "In this next section, we will install the Azure libraries in the Python environment we've created." You could take a second to explain some of the rudimentary methodological lessons relevant here, relating to Python package management and virtual environments.
I don't think this should require a lot of time-consuming revision. Mostly I think you could just weave a few more moments of commentary and discussion throughout the piece, and these commentaries can also serve as context and signposting for a reader to get oriented around a series of tutorial steps. It is helpful to remind the reader how this relates to research questions, but it is most important that there is some gesture towards the purposes of transcription in the Introduction and Conclusion. Right now, the ending, "Summary," doesn't really provide any conclusive claims about what you've taught and how it's useful to historical research. Feel free to take a little more time to explain the implications to the reader!
As you make these last revisions, be mindful that the most important lessons for the reader are methodological, relating to transcription and historical research. For example, when you are teaching about using "Keys," that's a good opportunity to generalize about this essential aspect of using cloud services for historical research. Secondarily, it is most important that through this lesson you teach the reader about Python more so than about Microsoft products. If there are ways you can add commentary about the Python steps you are teaching, and the related concepts, that would be welcome. For instance, a lot of your code has comment lines with information about what the code does. In some of those cases, you could add commentary before or after the code chunk where you explain what that section of code does, and which coding concepts are important to be aware of.
Once you make these minor additions, we can move forward with publishing this lesson. Thanks for your work finalizing it!
Hi @hawc2 (and @jeffblackadar), thank you for this detailed comment! Just quickly jumping in to note that the footnote [^1] ("This is an image from the 1917 wartime diary of Captain William Andrew White photographed by the author during research.") was a choice made in our copyedit. You can see this change was made with this link at lines 219 (green) and 539 (green). This was part of an effort to tidy up the Bibliography and Endnotes, keeping source links out of the main text.
Thank you, @charlottejmc. That makes good sense.
I agree that if this endnote was reinstated as a line within the lesson reading _This is an image from the 1917 wartime diary of Captain William Andrew White_ it would be good to extend the sentence (and/or add one or two) to surface that taking photographs of handwritten documents when you're doing archival research is a good example of a scenario where you might want to perform automatic transcription to help save time. I think this is an interesting piece of contextual information about your experience of doing research, @jeffblackadar.
Hi @hawc2 https://github.com/hawc2, @charlottejmc https://github.com/charlottejmc and @Anisa Hawes @.***>,
Thanks very much for your review and feedback. I have made edits. To address the points I've used a Q:/A: structure to briefly note the question or point and how I addressed it.
Q: Introduction and breaking it up into sections. A: This reads well to me - thank you for the restructuring.
Q: Related lessons, making a connection with Isabelle Gribomont's ["OCR with Google Vision API and Tesseract"] A: New text: A related lesson is Isabelle Gribomont's "OCR with Google Vision API and Tesseract". That lesson provides a method to combine Google Cloud Platform’s character recognition with Tesseract’s layout detection. It is possible that Tesseract’s layout detection capability could be combined with Microsoft Azure's Cognitive Services' handwriting recognition to improve the structure of the transcribed text.
Q: One concern we have about this lesson is that it depends on "commercial" software. Perhaps you could say a little more early on about any free trials or ways people might demo this software without cost (for instance, when you first mention that it is commercial software). I see under Prerequisites you mention the free tier in relation to the credit card being required, but it might be worth stating that earlier as well.
A: I've added "Microsoft Azure's Cognitive Services has a free tier of service available."

Q: After explaining the advantages of commercial software, it would be helpful to point readers to free open-source options before digging into this commercial option.
A: I have found no free open-source options for an already-trained handwriting recognition service. Fortunately, the Microsoft option is free for fewer than 5,000 images a month.

Q: One problem with commercial software and tutorials like this is that they are often unsustainable, since the companies change them so often and there's no open-source option to point to. Anything you can say about that in the tutorial would also be worth bringing up in relation to the 'commercial' service.
A: This is a good point, and something I considered when starting this lesson. I can say that I have been using this service for about 4 years, so it has a good track record. You're right, though: a time will come when this will change.

Q: Photograph citation, context and explanation of research.
A: New text: This is an image from the 1917 wartime diary of Captain William Andrew White, photographed by the author during research. This research involved text analysis with natural language processing to extract, catalogue and relate the names of the people, locations and organizations that appeared in the diary. To do the research, it was necessary to transcribe the diary into digital form.

Q: At the beginning of each section, add a little commentary.
A: An introduction is now given for each step.

Q: For "Keys", there is a good opportunity to generalize about this essential aspect of using cloud services for historical research.
A: A note about the general use of keys and the use of a URL has been added.

Q: Add commentary before or after each code chunk to explain what that section of code does, and which important coding concepts readers should be aware of.
A: Some of the explanation that was there was edited out because the same points were covered by comments in the code. This makes sense to me, since the comments explain what is happening in context with the program, while there is still a place for explanation of the program overall.
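The note about keys that Jeff mentions usually boils down to one pattern: keep the key and endpoint out of the source code (for example, in environment variables), and send the key in a request header. The sketch below is an illustration, not code from the lesson; the environment variable names are invented, and the REST route shown (`/vision/v3.2/read/analyze`, authenticated with the `Ocp-Apim-Subscription-Key` header) is Azure's Read API — the lesson itself may use Azure's Python SDK instead.

```python
import os

# Hypothetical environment variable names; the values come from the
# "Keys and Endpoint" tab of the Azure portal.
KEY = os.environ.get("AZURE_COGNITIVE_KEY", "<your-key>")
ENDPOINT = os.environ.get("AZURE_COGNITIVE_ENDPOINT",
                          "https://<your-resource>.cognitiveservices.azure.com")

def build_read_request(endpoint, key):
    """Assemble the URL and headers for a call to Azure's Read (OCR) REST API.
    'Ocp-Apim-Subscription-Key' is the header Azure checks to authenticate you."""
    url = endpoint.rstrip("/") + "/vision/v3.2/read/analyze"
    headers = {
        "Ocp-Apim-Subscription-Key": key,            # never hard-code this value
        "Content-Type": "application/octet-stream",  # raw image bytes in the body
    }
    return url, headers

url, headers = build_read_request(ENDPOINT, KEY)
print(url)
```

Because anyone holding the key can run up charges against your account, this separation of secret (key) from address (endpoint URL) is the general habit to form with any cloud service, not just Azure.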
I hope I am close to addressing the items you're looking for. Best regards, Jeff
On Fri, Nov 10, 2023 at 8:15 AM Anisa Hawes @.***> wrote:
Thank you, @charlottejmc https://github.com/charlottejmc. That makes good sense.
I agree that if this endnote was reinstated as a line within the lesson reading This is an image from the 1917 wartime diary of Captain William Andrew White http://www.biographi.ca/en/bio/white_william_andrew_16E.html it would be good to extend the sentence (and/or add one or two) to surface that taking photographs of handwritten documents when you're doing archival research is a good example of a scenario where you might want to perform automatic transcription to help save time. I think this is an interesting piece of contextual information about your experience of doing research, @jeffblackadar https://github.com/jeffblackadar.
Thank you very much @jeffblackadar for all those changes – we appreciate your work on this! I have gone through and applied some copyedits, which you can see by clicking on this link to read the "rich diff" in detail.
I'll take this opportunity to reiterate the points we still need from the checklist in my comment above:
- Receipt of author(s) copyright agreement – here is the form https://programminghistorian.org/assets/forms/Authorial-copyright-and-publishing-rights.pdf which you can fill out and send to my email address publishing.assistant [@] programminghistorian.org.
- Define the research activity this lesson supports (acquiring, transforming, analysing, presenting, or sustaining) – I've suggested transforming, but let me know what you think.
- Define the lesson's topics (apis, python, data-management, data-manipulation, distant-reading, set-up, linked-open-data, mapping, network-analysis, web-scraping, digital-publishing, r, or machine-learning) – Choose one or more. Let us know if you'd like us to add a new topic.
- Provide alt-text for all figures – Figures 3 and 7 are missing their alt-text description!
- Provide a short abstract for the lesson.
- Provide avatar_alt (a visual description of that thumbnail image).
- Prepare x2 posts for future promotion via our social media channels – this is usually in the hands of @hawc2.
Thanks again!
Hi @charlottejmc
Sorry I missed the checklist, some replies:
Receipt of author(s) copyright agreement – here is the form https://programminghistorian.org/assets/forms/Authorial-copyright-and-publishing-rights.pdf which you can fill out and send to my email address publishing.assistant [@] programminghistorian.org.
Jeff: The form is sent.
Define the research activity: this lesson supports (acquiring, transforming, analysing, presenting, or sustaining) - I've suggested transforming but let me know what you think.
Jeff: I agree, transforming is good.
Define the lesson's topics: (apis, python, data-management, data-manipulation, distant-reading, set-up, linked-open-data, mapping, network-analysis, web-scraping, digital-publishing, r, or machine-learning)
Choose one or more. Let us know if you'd like us to add a new topic
Jeff: I suggest: python, apis, data-manipulation - This matches Isabelle Gribomont's lesson
Provide alt-text for all figures – Figures 3 and 7 are missing their alt-text description!
Jeff: I checked the figures and see alt text. (Thank you if someone edited this)
Provide a short abstract: for the lesson
Tools for machine transcription of handwriting are practical and labour-saving if you need to analyse or present text in digital form. This lesson will explain how to write a Python program to transcribe handwritten documents using Microsoft's Azure Cognitive Services, a commercially available service that has a cost-free option for low volumes of use. Google Colab is used as the example Python programming environment.
Provide avatar_alt: (visual description of that thumbnail image)
Jeff: I like the one you found (this one https://www.loc.gov/item/2004662405/)
Prepare x2 posts for future promotion via our social media channels – this is usually in the hands of @hawc2 https://github.com/hawc2
Thanks Jeff
Hello @jeffblackadar, thank you very much for getting back to me on those.
I've received the copyright agreement and added [python, apis, data-manipulation] as the topics.
I think the alt-text you saw was (perhaps confusingly!) simply placeholder text (alt="Visual description of figure image"). I made some changes to the file to suggest instead:
Figure 3 = "Screen capture of the Keys and Endpoint tab in the Azure Portal"
Figure 7 = "Picture of a handwritten diary entry"
I've also suggested the avatar_alt (visual description of lesson avatar) as "Drawing showing the design for the Youths progressive recorder, a mechanical handwriting copying machine."
If you're happy with those three descriptions, we can just keep them in!
Thanks again, Charlotte
This is great, Charlotte. Thank you!
Jeff
Hello @hawc2 ,
This lesson's sustainability + accessibility checks are now complete.
author(s) bio for ph_authors.yml:
- name: Jeff Blackadar
  orcid: 0000-0002-8160-0942
  team: false
  bio:
    en: |
      Jeff Blackadar has a Master of Arts in History with a specialization in Data Science from Carleton University.
.md file: /en/drafts/originals/transcribing-handwritten-text-with-python-and-azure.md
images: /images/transcribing-handwritten-text-with-python-and-azure
assets: /assets/transcribing-handwritten-text-with-python-and-azure
original avatar: /gallery/originals/transcribing-handwritten-text-with-python-and-azure
gallery avatar: /gallery/transcribing-handwritten-text-with-python-and-azure
Promotion:
Thanks @jeffblackadar for your thorough edits!
@charlottejmc we should be ready to move forward with publication.
Very exciting to hear! Thanks for this great news! Jeff
Hello Programming Historian, may I inquire if there is anything else you were looking for from me, please? I am not sure if you may be waiting on anything. Thanks, Jeff
Hello @jeffblackadar.
We are publishing your lesson today! I'll update you as soon as the DOI is live.
Very best, Anisa
Transcribing Handwritten Text with Python and Microsoft Azure Computer Vision is published! 🎉
Congratulations @jeffblackadar! Thank you all for your contributions
Our suggested citation for this lesson is:
Jeff Blackadar, "Transcribing Handwritten Text with Python and Microsoft Azure Computer Vision," Programming Historian 12 (2023), https://doi.org/10.46430/phen0114.
We appreciate your help to circulate our social media announcements about this lesson among your networks: Twitter/X: https://twitter.com/ProgHist/status/1733069313219699067 Mastodon: https://hcommons.social/@proghist/111544298504241295
I'd also be grateful if you might consider supporting our efforts to grow Programming Historian's community of Institutional Partners. This is a network of organisations across Europe, Canada, North America and Latin America who have invested in our success by contributing an annual membership fee in lieu of subscription.
Institutional Partnerships enable us to keep developing our model of sustainable, open-access publishing, and empower us to continue creating peer-reviewed, multilingual lessons for digital humanists around the globe.
If you think that supporting Diamond Open Access initiatives may be among the strategic priorities of the university or library where you work, please let me know.
You can email me <admin [@] programminghistorian.org>, and I can send you an information pack to share with your colleagues. Alternatively, feel free to put me in touch with the person or department you think would be best-placed to discuss this opportunity.
Sincere thanks, Anisa
Thanks Anisa and yes a huge thank you to everyone who worked on this lesson! This is a shared project that is complete thanks to your effort. (And would not be done without it.) Giulia Taurino, Maria Dermentzi, Megan S. Kane, Sarah Melton, Alex Wermer-Colan, Charlotte Chevrie and Anisa Hawes, I appreciate working with you very much. It means a lot to me to be able to see this published. I found this to be a very collaborative experience and I gained a rare learning opportunity too. Best wishes to you and PH! Jeff
Thank you for these kind words, @jeffblackadar. It makes us very proud to hear feedback like this.
We are grateful for your participation. Your second Programming Historian lesson! The first has also been translated into Portuguese – so these resources are reaching and benefiting a broad community of learners.
The Programming Historian has received the following tutorial on 'Transcribe Handwritten Text with Python and Microsoft Azure Computer Vision' by @jeffblackadar. This lesson is now under review and can be read at:
http://programminghistorian.github.io/ph-submissions/en/drafts/originals/transcribing-handwritten-text-with-python-and-azure
Please feel free to use the line numbers provided on the preview if that helps with anchoring your comments, although you can structure your review as you see fit.
I will act as editor for the review process. My role is to solicit two reviews from the community and to manage the discussions, which should be held here on this forum. I have already read through the lesson and provided feedback, to which the author has responded.
Members of the wider community are also invited to offer constructive feedback, which should be posted to this message thread, but they are asked to first read our Reviewer Guidelines (http://programminghistorian.org/reviewer-guidelines) and to adhere to our anti-harassment policy (below). We ask that all reviews stop after the second formal review has been submitted so that the author can focus on any revisions. I will make an announcement on this thread when that has occurred.
I will endeavor to keep the conversation open here on Github. If anyone feels the need to discuss anything privately, you are welcome to email me.
Our dedicated Ombudsperson is Ian Milligan (http://programminghistorian.org/en/project-team). Please feel free to contact him at any time if you have concerns that you would like addressed by an impartial observer. Contacting the ombudsperson will have no impact on the outcome of any peer review.
Anti-Harassment Policy
This is a statement of the Programming Historian's principles and sets expectations for the tone and style of all correspondence between reviewers, authors, editors, and contributors to our public forums.