Computer Vision for the Humanities: An Introduction to Deep Learning for Image Classification, Part 1

nabsiddiqui commented 3 years ago

The Programming Historian has received the following tutorial “Computer Vision for the Humanities: An Introduction to Deep Learning for Image Classification, Part 1” by @davanstrien. This lesson, which is in two separate parts, is now under review. This ticket is only for Part 1 which can be read here:

http://programminghistorian.github.io/ph-submissions/en/drafts/originals/computer-vision-deep-learning-pt1

Please feel free to use the line numbers provided on the preview if that helps with anchoring your comments, although you can structure your review as you see fit.

I will act as editor for the review process. My role is to solicit two reviews from the community and to manage the discussions, which should be held here on this forum. I have already read through the lesson and provided feedback, to which the author has responded.

Members of the wider community are also invited to offer constructive feedback which should post to this message thread, but they are asked to first read our Reviewer Guidelines (http://programminghistorian.org/reviewer-guidelines) and to adhere to our anti-harassment policy (below). We ask that all reviews stop after the second formal review has been submitted so that the author can focus on any revisions. I will make an announcement on this thread when that has occurred.

I will endeavor to keep the conversation open here on Github. If anyone feels the need to discuss anything privately, you are welcome to email me.

Our dedicated Ombudsperson is (Ian Milligan - http://programminghistorian.org/en/project-team). Please feel free to contact him at any time if you have concerns that you would like addressed by an impartial observer. Contacting the ombudsperson will have no impact on the outcome of any peer review.

Anti-Harassment Policy

This is a statement of the Programming Historian’s principles and sets expectations for the tone and style of all correspondence between reviewers, authors, editors, and contributors to our public forums.

The Programming Historian is dedicated to providing an open scholarly environment that offers community participants the freedom to thoroughly scrutinize ideas, to ask questions, make suggestions, or to requests for clarification, but also provides a harassment-free space for all contributors to the project, regardless of gender, gender identity and expression, sexual orientation, disability, physical appearance, body size, race, age or religion, or technical experience. We do not tolerate harassment or ad hominem attacks of community participants in any form. Participants violating these rules may be expelled from the community at the discretion of the editorial board. Thank you for helping us to create a safe space.

nabsiddiqui commented 3 years ago

This is a good tutorial @davanstrien and makes a very complicated process digestible to readers new to it. I did not have a lot of concerns about the lesson when it got started. The beginning had some small issues.

I understand that a broad level overview is provided before jumping into the nitty gritty, but I think the tutorial would be better if each individual concept is first explained and then the code for those concepts is shown below them. For instance, the code for loading the data would be preceded by a section about the type of data being trained and how it is being trained, the code on creating the model would have the explanation preceding it, etc. It seems the code section and the explanation section are already tied together given the crossover of subheadings so it makes more sense that the code would follow the explanation rather than relying on the reader to jump back and forth or make connections.

Other small issues:

[x] Section “Introduction” p. 3—“provide” should say “provides”
[ ] Section “Introduction” p.4 and 5—Makes more sense to place these near the end of the lesson as a section on limitations than at the beginning
[x] Section “Suggested Prior Skills” bullet 2—Should read “Introduction to Jupyter Notebooks” rather than “introduction to Jupyter Notebooks”
[x] Section “Lesson Setup” p.2—The word “simple” is used too repetitively
[x] Section “Lesson Structure”—bullet points about why the lesson is the way it is seem unnecessary. Suggestion is to remove them and the previous sentence introducing them.
[x] Section “A Quick Introduction to Machine Learning” p.5—Don’t think this final paragraph is needed
[x] Section “Creating our Classifier” p.1 and 2—The information about what frameworks are is unnecessary given that this lesson requires previous knowledge of Python. I would just begin with the next section.
[x] Section “Creating an Image Classifier in fastai” p.4—The stuff about star imports best practices could be an endnote.

davanstrien commented 3 years ago

@nabsiddiqui thanks so much for these initial suggestions and for editing this lesson. I will wait to hear back from other reviewers before making your suggested changes.

davanstrien commented 3 years ago

Thanks again @nabsiddiqui - I've now made all of the smaller fixes you've suggested. I just wanted to clarify I've properly understood your broader suggestion about the structure of the lesson before I make more changes, as it seems to entail some fairly substantial re-working.

I think the tutorial would be better if each individual concept is first explained and then the code for those concepts is shown below them.

Do you mean here that you think the section training-an-image-classification-model

should be merged into the computer-vision-using-deep-learning section, by combining the equivalent subsections within each - or am I misunderstanding?

nabsiddiqui commented 3 years ago

@davanstrien Sorry for the delay. I think it would help but it isn't a major issue. I think the tutorial can go as it is. Let me know when you are ready to go to the next step.

davanstrien commented 3 years ago

@davanstrien Sorry for the delay. I think it would help but it isn't a major issue. I think the tutorial can go as it is. Let me know when you are ready to go to the next step.

That's great, thanks 🙂. I will move to implement the changes to part 2 and will flag here if I think any of those changes require more changes here.

davanstrien commented 3 years ago

Sorry meant to cross-post here to say I've finished with the changes in #343. Let me know if there is anything else I should do for this part.

davanstrien commented 3 years ago

@nabsiddiqui just wanted to check in to see if you were waiting for anything else from me or if I could do anything else at my end to help progress this lesson?

nabsiddiqui commented 3 years ago

@davanstrien Sorry for the delay. We had a hard time finding reviewers. We have one reviewer that has agreed to work with us and are still in the process of finding another.

davanstrien commented 3 years ago

@davanstrien Sorry for the delay. We had a hard time finding reviewers. We have one reviewer that has agreed to work with us and are still in the process of finding another.

No problem at all, just wanted to check I wasn't supposed to do something still.

hawc2 commented 3 years ago

@davanstrien the Kaggle notebooks you link throw a 404 error. Can you update the link in your lesson to a working version of the Kaggle notebooks?

davanstrien commented 3 years ago

@davanstrien the Kaggle notebooks you link throw a 404 error. Can you update the link in your lesson to a working version of the Kaggle notebooks?

The Kaggle notebooks are currently private because I wanted to make them public once the lesson is ready. If you could share the Kaggle username of anyone who needs access I can give them access. If you prefer I can make the notebooks public but I'll add a disclaimer then that it's WIP.

hawc2 commented 3 years ago

@davanstrien Nabeel will email you the Kaggle accounts for both reviewers so they can have access. @mblack884 and @cderose have generously agreed to review both Part 1 and 2 of your lesson.

We expect the review process to be complete like October-November, at which point we will synthesize their feedback and give you directions for revision.

Please let us know if you have any questions in the meantime.

davanstrien commented 3 years ago

@mblack884 and @cderose have generously agreed to review both Part 1 and 2 of your lesson.

Thanks so much for offering to review 😀 I think you should both have access to the Kaggle notebooks/data but please let me know if you have issues with accessing anything needed.

cderose commented 3 years ago

@davanstrien, thank you for creating the lesson and giving me early access to it! It'll be a great addition to The Programming Historian. Please find my suggestions for Part 1 below (some of the feedback might change slightly after I finish Part 2, but I thought I'd post what I have in the meantime). I'd be happy to expand on or clarify any of my notes below.

Main feedback One of the most important contributions of this lesson is that it touches on why we should care about computer vision techniques in addition to how we can go about applying them responsibly. For example, p31 contains really terrific, concrete examples for the ways that an image classifier can directly feed into humanities research. Similarly, p33 breaks down why fastai was chosen as opposed to a different library. Paragraphs like p5 acknowledge that these models and the results we get from them, while useful, are flawed, which is why it's great that the lesson repeatedly highlights what to consider at each stage when building (or repurposing) a classifier.

The concepts are also clearly explained throughout, and it's helpful that the introduction to the lesson contains suggested prior skills as a way of setting expectations.

I was also glad to try out Kaggle. I've worked with locally hosted notebooks and Google Colab, but I hadn't tried notebooks in Kaggle before. In general, it worked really well, and it's excellent that Kaggle provides (mostly) free GPU access since it's unclear if Colab is going to continue to do that now that there's Colab Pro. However, Kaggle did require that I provide a phone number in order to get access to the GPU the first time I ran the notebook—the requirement to share personal data/have a Kaggle account in order to access the GPU could be a reason to switch to something else, or at least to spell out the tradeoffs more specifically regarding time estimates in case participants want to run things on the CPU instead. One additional Kaggle point you may address for readers who want to add their own datasets to Kaggle is whether uploaded datasets that you run in the notebook are private or not.

My main suggestion for revision is to reorganize the content so that there are fewer "we'll explain this later" sentences. @nabsiddiqui's suggestion of interweaving the related concept and code snippets together so readers don't have to jump back and forth would help with that. Since there isn't a lot of code anyway in Part 1, another way you could approach the lesson would be to turn Part 1 into a purely conceptual notebook and then have Part 2 as the more coding-intensive notebook. This change might also be helpful for readers with different experience levels and aims (readers who only want a high-level overview can look at Part 1, readers who already know in theory how things work can jump right into Part 2, and readers who need a foundation but also want to build a classifier can work through both parts). Either reorganization would also keep Kaggle from asking readers if they're still there while they're reading the lesson (Kaggle asked me a few times since there's currently a long gap between coding sections).

Minor edits/thoughts p3 -> change "provide" to "provides historians"

use "in" instead of "on" in: "Next to identifiying things on images"

p5 -> change "sheds" to "computer vision techniques shed"

for the sentence: "historians should be conscious of the fact that computer vision techniques shed light on certain parts of visual corpora, but might overlook, or even obscure, other parts", I would recommend adding "misidentify" and "misclassify" to the list as well to underscore that omission isn't the only issue with computer vision models. I'm thinking about examples I've seen where convolutional neural networks apply incorrect (and problematic) labels to cultural heritage materials.

p6 -> since this section is speaking of pluraized lessons, include a link to Part 2. It could be helpful to have a separate lesson aim section for each part, maybe with a shared "this is what you'll get out of taking them both together" sentence. You might repurpose or borrow from the Part One Conclusion (p93), which clearly frames the two parts.

p7 -> remove "though these lessons don’t try to ‘hide’ anything, they also won’t cover all topics in full detail." With that clause, the double emphasis on all topics not being thoroughly covered (which I take as a given since the field of computer vision is much larger than can be covered in two lessons) has the reverse effect of making me wonder what's being left out.

p10 -> titlecase: "Programming Historian"

p11 -> change "a couple" to "a few reasons"

pluralize: "Central Processing Units"

p12 -> depending on how you reorganize things, you might move the Kaggle section to just before the section "Creating an Image Classifier in fastai" (p35) since that's when we start using Kaggle

p13 -> #3, I didn't have an "Edit" button option; instead, I had to click "New Notebook". Since there's only one cell in the notebook (it looks like Kaggle offers some boilerplate code), you might add a cell in the notebook with a comment that confirms readers are in the right place. At first I thought "New Notebook" just created a blank notebook, but I was able to confirm it was the lesson's notebook by looking in "Data" and seeing the computer vision dataset.

You might also mention something about how saving works with notebooks in Kaggle; one place for that could be to extend the sentence for #3: "This will create a copy of the notebook that you can run and edit, where any changes you make will be saved, enabling you to return to them later [if that's how notebooks in Kaggle work]."

p13 -> for #4, change "is selected as to ‘GPU’" to "is set to ‘GPU’"

as I mentioned above, I had to provide a phone number to verify my Kaggle account before it would let me access the expanded options with the GPU; if you stick with Kaggle for the lesson, here's where you might offer time estimates for how long it would take to run things on the CPU instead

p14 -> the numbering reset - were these meant to be #5 and $6? You could make them into paragraphs and remove the numbering all together. For the reminder to close the session once finished (#2), are there any repercussions for just x'ing out of the browser versus formally closing the session? If so, you might mention them.

p17 -> add commas to the second sentence: "This algorithm would, over repeated expoure to examples, 'learn' patterns"

you might include a link to Wikipedia on unsupervised learning like you did for supervised learning, for anyone who wants to compare approaches

p19 -> add "that": "Now that we have got"

p20 -> split into two paragraphs: "This is a dataset of extracted visual content for 16,358,041 digitised historic newspaper pages drawn from the Library of Congress Chronicling America collection. Images have been placed into one of seven categories, including photographs and advertisements." Side question - did humans do the placing or a computer? In the next paragraph it sounds like the model did the placing, but to make it explicit here, you could rephase along the lines of: "A computer vision model placed images into one of seven categories..."

p22 -> change comma to semicolon: "it will contain errors; for now, we will accept"

p38 -> you might explicitly tell readers to add the code to the Kaggle notebook and run the cell(s). You may also add a sentence that says for each code block in the tutorial, create a new cell in the Kaggle notebook, if you'd recommend adding each block as separate cells. In the Python classes I've taught where we used Jupyter notebooks, I've gotten the question about when to add a new cell; for this tutorial, it seems to make sense to have a separate cell for each code block, which might help with debugging if someone runs into a snag. Also, can we ignore the numpy and pandas cell Kaggle pre-loaded by default? (I skipped over that cell, but readers who might not know what that cell is doing may need more explicit guidance.)

p41 -> the filepath in the code block returned an error when I ran it; here's the filepath I used that worked: "../input/computer-vision-for-the-humanities-ph/kaggle/ads_data"

p44 -> might make a quick note that the exact ads returned will be different from the included example, but that the important thing is, as you say, "that the labels have been associated correctly..."

p49 - change comma to semicolon: "information about the model; this includes our tracked metric"

side note: my score was significantly lower (72%); this paragraph could be a good place to mention ways of improving model accuracy (ex, more epochs) while also cautioning against overfitting (those sections currently come later, but they could be more meaningful following this example).

p50 -> take out "to do" in: "the aim here was not to show the best solution with this particular dataset but to give a sense of what is possible to do with a limited number of labeled examples."

p52 -> add hyphen: "a high-level illustration"

p56 -> add parentheses and fix typo after the example: "They could be used directly for making decisions ( for example, automating where images are displayed in a web collection), but oftentimes, the predictions will be fed back to a human for further analysis"

p57 -> missing image (404 page)

p62 -> briefly mention some of the factors that affect how much data is needed (or point to resources online that will) to help readers who are interested in assembling their own dataset

p64 -> this paragraph could be a good place to mention holding onto some data for validation purposes, too

p67: fix typo by rephrasing: "we focus on a specific type of deep learning that uses a 'Convolutional Neural Network' (CNN)."

p72: fix typos: "trained to predict bounding boxes for a number of different types of objects in an image" and "This can impact the performance of this model on your data"

You point out several important limitations to using pre-trained models for historical data. For balance, you might also incude a couple pros for people who are wondering whether a pre-trained model might work for them (especially since you go on to recommend starting with a pre-trained model).

p81: singularize "translate": "that the weights a model is learning on the training data also translate to new data"

change commma to semicolon: "the validation data is only used to 'test' the predictions of the model; it is not directly used by the model to update the weights"

p84: fix typo: "to try to validate whether a various technique helps" add hyphen: "19th-century newspaper adverts" change comma to semicolon: "flag set to False; this flag tells fastai"

p85: add "that" and a comma: "Now that we have created a new learner, we'll use"

davanstrien commented 2 years ago

Thank you so much for your review and suggestions so far - these are really useful. I will probably work backwards and start with the "Minor edits/thoughts" corrections and then come back to the more substantive corrections changes once I have done that. I will also have a think about how to restructure the order to address the comments made by you and @nabsiddiqui.

davanstrien commented 2 years ago

@nabsiddiqui @cderose @mblack884 I've blocked some time out next week to work on the changes suggested so far but just wanted to check in about part 2 reviews. If they will be ready soon I can hold off making changes but otherwise, I will proceed with making the changes suggested to part 1 so far.

mblack884 commented 2 years ago

I should have my review of Part 1 completed by the end of this week. My apologies for the delay. It has been a busier than usual semester.

davanstrien commented 2 years ago

@mblack884 no problem for the delay, and many thanks for doing the review. I will hold off with making changes for now.

mblack884 commented 2 years ago

First, thank you for taking up such a complex topic and working to create an inviting introduction to it. I think that with some revision, this lesson will both help demystify deep learning for people who are interested in thinking about how the systems built on these techniques are influencing our relationship to digital culture and give those interested in trying it out a good methodical model to follow. I’ve divided my suggestions into technical & conceptual suggestions, as well as a list of more paragraph-specific issues to consider.

Suggestions about Technical Content It seems to me like this lesson is intended to be “plug-and-play” in the sense that users should be able to setup a Kaggle account, access the lesson directory, and load a notebook that lets them get started quickly. If that’s your goal, then you should emphasize the quick start-up in the lesson text and make sure that Kaggle environment provides them with sample, working code to get started with.

As you note in the lesson, many readers are likely to skim through quickly to look for setup instructions. I certainly fell into this category and it took me a few minutes to confirm that I could actually do everything in the cloud. I think it would help readers quite a bit to signal directly and explicitly that this lesson can be followed without any local setup. Not only would this help people looking to figure out setup get their questions answered quickly, it would likely also make the lesson more inviting for people new to computer vision (or relatively new to ML in Python). A simple fix would be to amend the Kaggle subheading just before P12 to signal more directly that this is the “cloud setup” instruction (Cloud Setup with Kaggle or something similar)

Once I got into Kaggle, I ran into other problems. Based on the lesson as written, I initially assumed that the notebooks had been removed from the project because there were none to select. Maybe they aren’t shared correctly? Regardless, there was no “Edit” button for me. I did get access to a notebook environment by clicking “New Notebook,” but after that I found the lesson hard to follow because it appeared written under the assumption that users would have some sample code prepared for them to work with. I had to copy and paste code over in piecemeal from the lesson to proceed into the practical side of it. Additionally, there was also no “Accelerator” option for me in the new notebook that I created.

While copying and pasting was not a major problem, the code that I transferred from the lesson into the notebook would not work without modification. Like the other reviewer, I was able to fix some simple issues and get parts of it to work. I was able to follow the lesson up to P44 this way. After that, I started running into errors that I did not have time to sit down and fix. Because of these issues, I skimmed over the rest of instructions.

I’d be happy to take another look at the practical parts of the lesson sometime in the next few weeks if you choose to rework this aspect of the lesson. Having access code to review alongside a conceptual description of the process has been important to my own self-study in programming, and I think that this lesson will similarly help many learn to incorporate computer vision into their research.

Suggestions about Conceptual Content There are some organization issues that I found confusing as a reader. The lesson begins with some straight-forward instructions, some good background information, and then some initial experiments applying concepts from the background info.

At the same time, there are also moments when you talk a lot about general methodological concerns, and it is not clear whether these are concepts/questions that readers should be considering when following your instructions for the lesson. In P54-56, for example, you talk about how the data present here likely won’t be useful for readers, and they’ll have to create some data themselves. While that is true beyond the context of this specific lesson, I could see some readers getting confused here (“Do I need to stop now and go find some data?”). I would move considerations about how to plan a similar case study to end. A good way to handle those kind of comments would be a section before the conclusion on steps for addressing your own question using computer vision / deep learning. It could also serve as a good, quick review of the workflow they just followed.

Generally, I would suggest that once you start providing practical instruction, you should maintain a focus on walking readers through the case study you’ve prepared. It is ok to weave in some conceptual discussion (to explain what students are doing), but getting too deep into that discussion will make it hard to understand the practical side of designing/managing a workflow. Admittedly, my ability to follow some of the discussion in the latter half of the lesson may have been due to the issues I noted above getting the lesson to work in Kaggle.

Other Minor Suggestions Introduction should include some definition of deep learning or its relationship to computer vision & machine learning. You do well defining it later, but because its included in the “aims” for the lessons, a brief summary in the introduction is needed. Without it, I think there is a risk that readers who are new to these concepts might assume that deep learning and machine learning can be used interchangeably.

P1: “Although” used to begin two consequence sentences. Change one to “While” or “Even though”?

P2: I’d suggest moving the link to Natural Language Processing to P1, where the concept is first introduced.

P3: Change “provide” to “provides”

P5: Avoid using an acronym before providing the full name. “ML” in first sentence should be “machine learning (ML).” Might need to rephrase that phrase to “fairness in machine learning (ML)” so the parenthetical doesn’t break up the concept.

P7: “These lessons don’t aim to:”

P14: The link URL to the Kaggle site is different than the URL in the text (link goes to the github repo).

P15: Numbering in list restarts at 1 after the figure

P22: Break the final sentence up to improve readability. I’d suggest: “Since the data from Newspaper Navigator is predicted by a machine learning model, it will contain errors. For now, we will accept that the data we are working with is imperfect.” Might also add a short “because…” clause here to explain why. If I were new to this area, I’d wonder why errors are acceptable.

P32: Unless you need to explain why you chose the fastai in the opening sentence, you could cut the first sentence and open with “fastai is a Python library…”. If you do, move hyperlink to the new first sentence.

P58: “The deep learning training loop” image did not load for me

davanstrien commented 2 years ago

@cderose turning you small edits into a todo list below so I can track more easily.

[x] p3 -> change "provide" to "provides historians"
[x] use "in" instead of "on" in: "Next to identifiying things on images"
[x] p5 -> change "sheds" to "computer vision techniques shed"
[x] for the sentence: "historians should be conscious of the fact that computer vision techniques shed light on certain parts of visual corpora, but might overlook, or even obscure, other parts", I would recommend adding "misidentify" and "misclassify" to the list as well to underscore that omission isn't the only issue with computer vision models. I'm thinking about examples I've seen where convolutional neural networks apply incorrect (and problematic) labels to cultural heritage materials.
[ ] p6 -> since this section is speaking of pluraized lessons, include a link to Part 2. It could be helpful to have a separate lesson aim section for each part, maybe with a shared "this is what you'll get out of taking them both together" sentence. You might repurpose or borrow from the Part One Conclusion (p93), which clearly frames the two parts.
[x] p7 -> remove "though these lessons don’t try to ‘hide’ anything, they also won’t cover all topics in full detail." With that clause, the double emphasis on all topics not being thoroughly covered (which I take as a given since the field of computer vision is much larger than can be covered in two lessons) has the reverse effect of making me wonder what's being left out.
[x] p10 -> titlecase: "Programming Historian"
[x] p11 -> change "a couple" to "a few reasons"
[x] - pluralize: "Central Processing Units"
[x] p12 -> depending on how you reorganize things, you might move the Kaggle section to just before the section "Creating an Image Classifier in fastai" (p35) since that's when we start using Kaggle
[ ] p13 -> didn't have an "Edit" button option; instead, I had to click "New Notebook". Since there's only one cell in the notebook (it looks like Kaggle offers some boilerplate code), you might add a cell in the notebook with a comment that confirms readers are in the right place. At first I thought "New Notebook" just created a blank notebook, but I was able to confirm it was the lesson's notebook by looking in "Data" and seeing the computer vision dataset.
[ ] You might also mention something about how saving works with notebooks in Kaggle; one place for that could be to extend the sentence for 3: "This will create a copy of the notebook that you can run and edit, where any changes you make will be saved, enabling you to return to them later [if that's how notebooks in Kaggle work]."
[x] p13 -> for # 4, change "is selected as to ‘GPU’" to "is set to ‘GPU’"
[ ] as I mentioned above, I had to provide a phone number to verify my Kaggle account before it would let me access the expanded options with the GPU; if you stick with Kaggle for the lesson, here's where you might offer time estimates for how long it would take to run things on the CPU instead
[ ] p14 -> the numbering reset - were these meant to be # 5 and $6? You could make them into paragraphs and remove the numbering all together. For the reminder to close the session once finished (# 2), are there any repercussions for just x'ing out of the browser versus formally closing the session? If so, you might mention them.
[x] p17 -> add commas to the second sentence: "This algorithm would, over repeated expoure to examples, 'learn' patterns"
[x] you might include a link to Wikipedia on unsupervised learning like you did for supervised learning, for anyone who wants to compare approaches
[x] p19 -> add "that": "Now that we have got"
[x] p20 -> split into two paragraphs: "This is a dataset of extracted visual content for 16,358,041 digitised historic newspaper pages drawn from the Library of Congress Chronicling America collection. Images have been placed into one of seven categories, including photographs and advertisements." Side question - did humans do the placing or a computer? In the next paragraph it sounds like the model did the placing, but to make it explicit here, you could rephase along the lines of: "A computer vision model placed images into one of seven categories..."
[x] p22 -> change comma to semicolon: "it will contain errors; for now, we will accept"
[ ] p38 -> you might explicitly tell readers to add the code to the Kaggle notebook and run the cell(s). You may also add a sentence that says for each code block in the tutorial, create a new cell in the Kaggle notebook, if you'd recommend adding each block as separate cells. In the Python classes I've taught where we used Jupyter notebooks, I've gotten the question about when to add a new cell; for this tutorial, it seems to make sense to have a separate cell for each code block, which might help with debugging if someone runs into a snag. Also, can we ignore the numpy and pandas cell Kaggle pre-loaded by default? (I skipped over that cell, but readers who might not know what that cell is doing may need more explicit guidance.)
[x] p41 -> the filepath in the code block returned an error when I ran it; here's the filepath I used that worked: "../input/computer-vision-for-the-humanities-ph/kaggle/ads_data"
[x] p44 -> might make a quick note that the exact ads returned will be different from the included example, but that the important thing is, as you say, "that the labels have been associated correctly..."
[x] p49 - change comma to semicolon: "information about the model; this includes our tracked metric"
[ ] - side note: my score was significantly lower (72%); this paragraph could be a good place to mention ways of improving model accuracy (ex, more epochs) while also cautioning against overfitting (those sections currently come later, but they could be more meaningful following this example).
[x] p50 -> take out "to do" in: "the aim here was not to show the best solution with this particular dataset but to give a sense of what is possible to do with a limited number of labeled examples."
[x] p52 -> add hyphen: "a high-level illustration"
[x] p56 -> add parentheses and fix typo after the example: "They could be used directly for making decisions ( for example, automating where images are displayed in a web collection), but oftentimes, the predictions will be fed back to a human for further analysis"
[x] p57 -> missing image (404 page)
[ ] p62 -> briefly mention some of the factors that affect how much data is needed (or point to resources online that will) to help readers who are interested in assembling their own dataset
[ ] p64 -> this paragraph could be a good place to mention holding onto some data for validation purposes, too
[x] p67: fix typo by rephrasing: "we focus on a specific type of deep learning that uses a 'Convolutional Neural Network' (CNN)."
[x] p72: fix typos: "trained to predict bounding boxes for a number of different types of objects in an image" and "This can impact the performance of this model on your data"
[ ] You point out several important limitations to using pre-trained models for historical data. For balance, you might also incude a couple pros for people who are wondering whether a pre-trained model might work for them (especially since you go on to recommend starting with a pre-trained model).
[x] p81: singularize "translate": "that the weights a model is learning on the training data also translate to new data"
[x] change commma to semicolon: "the validation data is only used to 'test' the predictions of the model; it is not directly used by the model to update the weights"
[x] p84: fix typo: "to try to validate whether a various technique helps" add hyphen: "19th-century newspaper adverts" change comma to semicolon: "flag set to False; this flag tells fastai"
[x] p85: add "that" and a comma: "Now that we have created a new learner, we'll use"

davanstrien commented 2 years ago

Improvements to make around running lesson code

[ ] Make Kaggle setup clear
- [ ] how to run notebooks in kaggle (make public test of kaggle setup)
- [ ] notes about working with kaggle (TODO add list here)
- [ ] how to save notebooks
- [ ] how to edit
- [ ] find link to kaggle intro?
[ ] General guidance for running through lesson
- [x] make it clearer that code can be run with no setup
- [ ] make it clearer that a notebook should be available to run through the code
- [x] suggest reading lesson on PH site initialy and then running through code as a seperate step (to help with kaggle bugging you to see if you are still around)

davanstrien commented 2 years ago

Edit todos

[x] P1: “Although” used to begin two consequence sentences. Change one to “While” or “Even though”?
[x] P2: I’d suggest moving the link to Natural Language Processing to P1, where the concept is first introduced.
[x] P3: Change “provide” to “provides”
[x] P5: Avoid using an acronym before providing the full name. “ML” in first sentence should be “machine learning (ML).” Might need to rephrase that phrase to “fairness in machine learning (ML)” so the parenthetical doesn’t break up the concept.
[x] P7: “These lessons don’t aim to:”
[x] P14: The link URL to the Kaggle site is different than the URL in the text (link goes to the github repo).
[x] P15: Numbering in list restarts at 1 after the figure
[x] P22: Break the final sentence up to improve readability. I’d suggest: “Since the data from Newspaper Navigator is predicted by a machine learning model, it will contain errors. For now, we will accept that the data we are working with is imperfect.” Might also add a short “because…” clause here to explain why. If I were new to this area, I’d wonder why errors are acceptable.
[x] P32: Unless you need to explain why you chose the fastai in the opening sentence, you could cut the first sentence and open with “fastai is a Python library…”. If you do, move hyperlink to the new first sentence.
[x] P58: “The deep learning training loop” image did not load for me

davanstrien commented 2 years ago

@cderose @mblack884 it seems that Kaggle permissions don't work how I thought they do and you didn't have full access to the notebooks. I've hopefully fixed this now. For the published lesson this should be less complicated. I have made most of the small fixes you both suggested. I will spend some time thinking about how the lesson structure could be made to flow a little bit easier between concepts and code. I will hopefully get those changes done this week or early next week.

davanstrien commented 2 years ago

Hi @cderose @mblack884 @nabsiddiqui

Firstly, appologies for not getting to this before the end of last year.

I have now hopefully made fixes for most of your minor changes. A few I have left since they deal with Kaggle issues which I hopefully have addressed separately.

I am still hesitant to move all of the theory to the top of the lesson and only cover the code at the end. However, I realise that this caused some jumping around between the two types of material. My proposed solution for this is the following:

I have restructured the lesson to ensure there is only one switch between code and theory/conceptual discussion. The lesson now has:
- bit of an intro
- training the first illustrated/text-only model
- explanation of what is going on with the code and the ml workflow.
I have moved the 'experiment' with transfer learning to an appendix - the surrounding discussion remains. Still, I think the experiment can be left as an activity for those keen to fiddle around more.
I have modified the setup instructions to suggest that readers first read through all of the material on the PH website and then go through the content in the Kaggle notebook.

On this note, unfortunately, I think I messed up in how I shared the Kaggle notebooks with you both. I think you both ended up having to copy and paste, which wasn't my intention. The idea is that a reader can run everything in the Kaggle notebook without any modifications, and it will work. I have pinned the docker environment, so the code should continue to run without issues in the future. I have now made the Kaggle dataset and notebook public, so it is possible to see how it should look. The notebook can be found here: https://www.kaggle.com/davanstrien/01-progamming-historian-deep-learning-pt1-ipynb

The idea would be that all of the content is the same with the Kaggle notebook offering readers a chance to modify the code, play around etc. The notebooks on Kaggle are currently out of sync with this version of the lesson, but once a final version is approved, I will make sure to sync them back up.

I hope that having removed some of the issues with the Kaggle will make the process a bit smoother. Happy to hear suggestions if you think this restructure etc., doesn't address all of your concerns.

anisa-hawes commented 2 years ago

Hello all,

Please note that this lesson's .md file has been moved to a new location within our Submissions Repository. It is now found here: https://github.com/programminghistorian/ph-submissions/tree/gh-pages/en/drafts/originals

A consequence is that this lesson's preview link has changed. It is now: http://programminghistorian.github.io/ph-submissions/en/drafts/originals/computer-vision-deep-learning-pt1

Please let me know if you encounter any difficulties or have any questions.

Very best, Anisa

hawc2 commented 2 years ago

@davanstrien I'm going to test the code for both lessons soon using the complete Kaggle notebook you created. To not clog up this issue ticket too much, I will probably email you if I run into problems relating specifically to the Kaggle notebook.

Looking at the PH lesson Part 1, the link in Part 1 doesn't go to the Kaggle notebook . Can you fix that link in paragraph 13? As we discussed in #343, a single Kaggle notebook without much commentary will be linked from both lessons.

Can you update the title of this complete Kaggle notebook as well, and then link it in both places so it properly represents the complete pipeline?: https://www.kaggle.com/code/davanstrien/cleaned-01-progamming-historian-deep-learning-pt1/notebook

For both lessons Part 1 and 2, you could also link the notebook more clearly so the reader knows where the link is going. In Part 2, the link is just the word "Kaggle."

davanstrien commented 2 years ago

@davanstrien I'm going to test the code for both lessons soon using the complete Kaggle notebook you created. To not clog up this issue ticket too much, I will probably email you if I run into problems relating specifically to the Kaggle notebook.

thanks!

Looking at the PH lesson Part 1, the link in Part 1 doesn't go to the Kaggle notebook . Can you fix that link in paragraph 13? As we discussed in #343, a single Kaggle notebook without much commentary will be linked from both lessons. I've updated this link now and added an updated link for the second notebook. Once the lesson is live I will also link back from the Kaggle notebooks to the PH website.

davanstrien commented 2 years ago

Documenting outcome email exchange between @hawc2 and @davanstrien

Code notebook for the two parts of the lesson put into a single notebook: https://www.kaggle.com/code/davanstrien/ph-computer-vision-tutorial-part-1-and-2
I have added back in some more commentary for the notebooks alongside the code. The narrative lessons of the lesson are not included in the notebooks. I also think it's preferable to allow people to interact with the material through a single display as much as possible i.e. avoid switching windows between code/explanation.
I have also created a new GitHub repository for the lesson here: https://github.com/davanstrien/Computer-Vision-for-the-Humanities-an-introduction-to-deep-learning-for-image-classification which includes links to the data used and also provides a version of the notebook that will run on Google Collab.
My intention is to keep the lesson's code and notebooks in sync as much as possible i.e. not update the GitHub notebooks without changing the lesson code. The exception is for breaking changes. The Kaggle environment is pinned so should be less likely to break.

hawc2 commented 2 years ago

Thanks @davanstrien! @nabsiddiqui and I are both ready to sign off on Part 1 and 2 of this lesson and hand it off to @anisa-hawes for copyediting!

anisa-hawes commented 2 years ago

Thank you, all! I am back from Leave today. I have #436 (another two-part lesson) on my desk for this week, so I will start work on copyediting this lesson next week, aiming to deliver ASAP.

davanstrien commented 2 years ago

@anisa-hawes version 2.6.0 https://github.com/fastai/fastai/releases/tag/2.6.0 of fastai introduces a change to one of the functions (cnn_learner) used in the lesson. Nothing in the code breaks but it would be nice to update this at some point. I didn't want to make changes to the lesson without checking in case you were doing copyedits. I can also wait to make that change in a pull request after the lesson is published since everything works without it too.

hawc2 commented 2 years ago

@anisa-hawes, I'm just checking in on this lesson's copy-edits - do you need anything from me or Nabeel at this time? Are you waiting on @davanstrien or vice-versa?

Thanks all. Looking forward to seeing this published in the next few weeks!

davanstrien commented 2 years ago

@anisa-hawes let me know if you need anything from me. I'm happy to make the small edits to the function names above before or after the copyediting -- it should only be a couple of sentences to modify at most.

davanstrien commented 2 years ago

@hawc2 @nabsiddiqui, @anisa-hawes and I have discussed copy-edits via email. The remaining steps from my perspective are:

@anisa-hawes to confirm the formatting used for image alt texts. Once this is confirmed I will add these in (I can commit to turning this around in 1 day).
I believe the lesson should be ready for publication - looking forward to confirming a date and getting this out there!

anisa-hawes commented 2 years ago

Thank you, @davanstrien.

Actions from my side:

[x] Review Daniel's latest commits
[x] Replace external links in the lessons with archival perma.cc links (except in cases where the link directs to an action on the live web, e.g., install software)
[x] Clarify the liquid syntax needed to insert alt text. This is in-progress: see Jekyll Issue #2612
[x] Send Daniel and co-authors Authorial Copyright and Permission to Publish form
[x] Receive x5 Authorial Copyright and Permission to Publish forms

davanstrien commented 2 years ago

thanks, @anisa-hawes I have sent my author form now.

anisa-hawes commented 2 years ago

Dear @davanstrien.

Apologies for the delay! We've implemented some updates to _includes/figure.html which enables any alt tags included within our liquid syntax to be rendered.

The syntax to use is as follows:

{% include figure.html filename="file-name.png" alt="Visual description of figure image" caption="Caption text to display" %}

An important note is that Markdown styling should not be included within your alt text, because in our tests we found that screen readers read all the characters directly (so bold was read as asterisks).

Let me know if you have any questions about this, or if I can help you by adding the alt-text in.

davanstrien commented 2 years ago

Dear @davanstrien.

Apologies for the delay! We've implemented some updates to _includes/figure.html which enables any alt tags included within our liquid syntax to be rendered.

The syntax to use is as follows:

{% include figure.html filename="file-name.png" alt="Visual description of figure image" caption="Caption text to display" %}

An important note is that Markdown styling should not be included within your alt text, because in our tests we found that screen readers read all the characters directly (so bold was read as asterisks).

Let me know if you have any questions about this, or if I can help you by adding the alt-text in.

Thanks for confirming. I have just added alt text for the images in both parts of the lesson.

anisa-hawes commented 2 years ago

Thank you, @davanstrien – I've just reviewed your commits and your alt text looks perfect ✨

--

Hello @hawc2 and @nabsiddiqui,

A few final YAML elements are missing, so I'd be grateful if you could complete the following fields (for both parts 1 and 2):

difficulty: TBC
activity: [TBC]
topics: [TBC]
abstract: TBC
avatar_alt: TBD

Please let me know if you have any questions about how to complete these, or if you'd like to post the information here I can add it to the .md files.

Thank you, Anisa

davanstrien commented 2 years ago

difficulty: 3
activity: analyzing
topics: [python]
abstract: This lesson introduces the topic of computer vision. In particular, the lesson provides an overview of how machine learning methods can be used to classify images into different categories. 
avatar_alt:

For topics, I've opened a PR https://github.com/programminghistorian/jekyll/pull/2621 for a machine learning topic but I'm happy to add this at a later stage.

For the image I thought this might work: computer-vision-deep-learning-original

The source for this image is https://www.flickr.com/photos/britishlibrary/11118264784/in/photolist-i8EUjF-i9g2XE-hZmQSn-i6TRmv-i7DGX8-i7GkaN-i7wvaY-ibkNxj-hWu1Lq-i7FYzr-icZEwh-i4ZhZD-i6wSNa-i8TKaT-i7BeeM-i1yE3D-idsXk5-i917XE-icxAB6-i95M3Q-idPyMM-i8Ss2B-i7A97f-hXLH3S-iczncs-i8WUvi-idgRjQ-id6J6t-icAH2A-iboHu7-i8WyS1-i7yjmD-i7pFaD-i7DkUQ-icy6yc-i8zRnC-i17tix-icCi2X-i8dRbX-hSXHhC-i95ATt-i7F2Q8-hZhNQK-i8h18j-i8V2wu-i7tN5q-icRLxk-i5ows2-i1hezb-hXN8kB

If that one works I would suggest this for the avatar_alt: An illustration of a below camera on top of a wooden stand with a dark cloth.

davanstrien commented 2 years ago

@anisa-hawes let me know if you need anything else from my end. I'm happy to send image via email if required

anisa-hawes commented 2 years ago

Thank you, for the image suggestion @davanstrien! We usually leave it up to the editors to select the images, but if @hawc2 and @nabsiddiqui are happy that is fine 🙂 We will need two images + alt text for both - one for Part 1 and another for Part 2 (#343).

@hawc2 and @nabsiddiqui will confirm the remaining YAML fields, and then do their final read-throughs. After that, we'll be able to confirm the anticipated publication date.

nabsiddiqui commented 2 years ago

@anisa-hawes I don't have any particular preference for the image. The one that @davanstrien chose is fine with me.

hawc2 commented 2 years ago

The yaml is all updated, I just need to upload the image assets, but I'll wait to do that until I have images for both lessons. Hopefully we can publish part 1 and 2 in the coming weeks!

davanstrien commented 2 years ago

Abstract for part 1:

This is the first of a two-part lesson introducing deep learning based computer vision methods for humanities research. Using a dataset of historical newspaper advertisements and the fastai Python library, the lesson walks through the pipeline of training a computer vision model to perform image classification.

davanstrien commented 2 years ago

I the author hereby grant a non-exclusive license to ProgHist Ltd to allow The Programming Historian English|en français|en español to publish the tutorial in this ticket (including abstract, tables, figures, data, and supplemental material) under a CC-BY license.

hawc2 commented 2 years ago

This lesson, Part 1 and 2, is now published! Congrats everyone! Thanks to our reviewers and the authors for all your work making this a really solid tutorial!

@nabsiddiqui can you add this and part 2 to our twitter bot?

anisa-hawes commented 2 years ago

Hello @nabsiddiqui. It would be great if you could add Tweets (2 for each Part) to our Twitter Bot spreadsheet, so we can periodically publicise these lessons in the future. Instructions for how to do that are here but let me know if you have any questions.

As the lessons are now published, I'm closing this Issue.

programminghistorian / ph-submissions

Computer Vision for the Humanities: An Introduction to Deep Learning for Image Classification, Part 1 #342

Anti-Harassment Policy