haesleinhuepf / BioImageAnalysisNotebooks

Python Jupyter notebooks for BioImageAnalysis, GPU-accelerated image processing, bio-image data science and more
https://haesleinhuepf.github.io/BioImageAnalysisNotebooks
BSD 3-Clause "New" or "Revised" License

Intro questions #14

Open guiwitz opened 2 years ago

guiwitz commented 2 years ago

Hi @haesleinhuepf,

I have been looking through the introductory part and have a few questions. I don't want to create a separate issue for each of them or make a PR for each before discussing, so I group them here. If you prefer separate issues let me know. And sorry in advance if some of the points are already explained somewhere else and I just missed them:

Ok, I think that's long enough... Let me know what you think and I'll make PRs if you agree with any of those!

Cheers, Guillaume

haesleinhuepf commented 2 years ago
  • Should 1-2 examples be added with f-strings? I find them extremely useful and now favour them even with beginners over the classic "my text" + str(a) approach.

Yes! Great idea. The "my text" + str(a) stuff is there because I came from ImageJ island when I started this. I think we should mention int() and str() but I fully agree that there are better ways such as f-strings.
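For reference, a minimal sketch of the two styles side by side. The variable names and values are made up for illustration and are not taken from the notebooks.

```python
# Hypothetical values, chosen only for illustration
a = 5
pixel_size = 0.325

# Classic approach: convert explicitly with str() and concatenate
classic = "The area is " + str(a) + " pixels"

# f-string: the expression inside the braces is formatted in place
modern = f"The area is {a} pixels"

# A format specifier controls rounding, here one decimal place
formatted = f"Pixel size: {pixel_size:.1f} um"

# int() goes the other way and turns text into a number
count = int("42") + 1

print(classic)
print(modern)
print(formatted)
print(count)
```

Both strings come out identical, so the f-string is purely a readability gain here.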

  • The part on masking numpy arrays seems oddly placed, as Numpy hasn't been introduced yet at that point. Maybe it should be moved?

True! This would also fit nicely next to cropping and slicing, or do you have another idea where to move it?
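To illustrate why masking sits naturally next to cropping and slicing, here is a small sketch. The tiny array stands in for an image and is an assumption for illustration only.

```python
import numpy as np

# A tiny made-up "image" for illustration
image = np.array([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]])

# Cropping: slicing selects a rectangular region by position
crop = image[0:2, 1:3]

# Masking: a boolean array selects pixels by value instead
mask = image > 5
bright_pixels = image[mask]

print(crop)
print(mask)
print(bright_pixels)
```

Both operations use square brackets on the same array, which is why they can be taught back to back.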

  • There's never an introduction to packages. The first one that appears is the math module, and it is used without much comment. Should there be a short notebook introducing this, in particular the different import variants such as from xx import yy, import xx etc.?

True! I do that in my lecture along the road. But I agree, having a notebook explaining it would be nice.
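A sketch of what such a notebook could cover, using the math module mentioned above. The radius of 2 and the variable names are arbitrary choices for illustration.

```python
# Variant 1: import the module and access members through its name
import math
area_1 = math.pi * 2 ** 2

# Variant 2: import a single name directly into the current namespace
from math import pi
area_2 = pi * 2 ** 2

# Variant 3: import the module under a shorter alias
import math as m
area_3 = m.pi * 2 ** 2

print(area_1, area_2, area_3)
```

All three variants reach the same object; they differ only in how much of the module name has to be typed and how visible the origin of a name remains.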

  • in the custom functions notebook, you explain how to document functions but don't show how to document inputs/outputs (e.g. numpy-docs). I think that's actually quite useful.

Yes, I didn't do that because I didn't want to overwhelm my target audience. Quite a few of them have issues understanding the concepts of functions and for-loops. Thus, I wanted to keep this part simple. How about adding a separate notebook about advanced docstrings and type annotations for parameters? E.g. in the advanced python section? It could go well next to the introduction of custom libraries. I also prefer numpy-style docstrings btw.
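As a starting point for that notebook, a minimal sketch of a numpy-style docstring combined with type annotations. The function scale_intensity is hypothetical and exists only to carry the docstring.

```python
def scale_intensity(values: list[float], factor: float = 2.0) -> list[float]:
    """Multiply every value by a constant factor.

    Parameters
    ----------
    values : list of float
        The intensity values to scale.
    factor : float, optional
        The multiplier applied to each value. Default is 2.0.

    Returns
    -------
    list of float
        A new list with the scaled values; the input is not modified.
    """
    return [v * factor for v in values]


print(scale_intensity([1.0, 2.0]))
print(scale_intensity.__doc__)
```

The annotations in the signature and the types in the Parameters section repeat each other on purpose: the first is read by tools, the second by people.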

  • In the introduction to image processing, there's never a real intro to Numpy; things are added bit by bit and people are referred to your other course Bio-image_Analysis_with_Python. Is that intentional?

Yeah, here you hit a knowledge-gap of mine. The link to the course may just be wrong because it basically consists of the same materials. I never introduce numpy as such properly, because I always thought image-processing workflow builders may not need it. They should know how to apply a filter to an image, e.g. using scikit-image. They should maybe not know how it is done under the hood. But I see your point. And I agree, a numpy-intro could be done more properly.

what is an array

You find something here, but there is room for improvement: https://github.com/haesleinhuepf/BioImageAnalysisNotebooks/blob/main/docs/12_image_analysis_basics/01_Introduction_to_image_processing.ipynb

why is it useful (simple operations on all pixels in one line of code without for loops)

The fun part is that I hardly use these simple operations on all pixels in daily practice. And our students are programming beginners. They never programmed a for-loop which ran over pixels. They learn scikit-image filters as the first and correct way. We almost exclusively teach them things like blurred_image = blur(image) and label_image = segment(blurred_image). In that way, the students do not think too much about what happens to the individual pixels. Conceptually, the image becomes the atomic entity of data. Did you think of a specific notebook from your resources for adding here? I'd be happy to take a look!
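A rough sketch of that image-as-atomic-entity style. Note the assumptions: blur and segment are hypothetical helper names taken from the sentence above, scipy.ndimage is swapped in here as a stand-in for the scikit-image filters used in the course, and the test image, sigma and threshold are arbitrary.

```python
import numpy as np
from scipy import ndimage

# Synthetic test image with two bright squares (made up for illustration)
image = np.zeros((20, 20))
image[2:6, 2:6] = 1.0
image[12:17, 12:17] = 1.0


def blur(image):
    # Gaussian smoothing; sigma=1 is an arbitrary choice
    return ndimage.gaussian_filter(image, sigma=1)


def segment(image):
    # Threshold at half the maximum, then label connected regions
    binary = image > 0.5 * image.max()
    label_image, _ = ndimage.label(binary)
    return label_image


# The workflow only ever passes whole images around
blurred_image = blur(image)
label_image = segment(blurred_image)

print(label_image.max())  # number of labelled objects
```

No line of the workflow touches an individual pixel, which is the point being made.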

what other things one can do with numpy (functions, random generators etc.)

My personal problem with numpy still is that I don't find many functions very intuitive. Many names are not informative from an image-processing perspective (e.g. nditer(), ravel() and reshape()). Thus, also here, I don't teach those in my course (yet) because they are hard to understand conceptually and not necessary for somebody who wants to apply filters to images and segment cells in an image. Also, in our course, we work almost entirely with practical examples and hardly ever generate images with artificial noise, or random images. Thus, I'm not sure if this kind of basics should be covered in great detail in this notebook collection. On the other hand, I'd be happy if you taught me basic numpy functions you often use when processing microscopy images. If you know cool tricks and useful functions that should be covered, please close that knowledge gap on my side :-) Also here: If you have a specific notebook in mind, please share a link!
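For readers unfamiliar with two of the functions named above, a minimal sketch of what ravel() and reshape() do. The 2 by 3 array is made up for illustration.

```python
import numpy as np

# Six values arranged as 2 rows and 3 columns
image = np.arange(6).reshape(2, 3)

# ravel() lists all pixels as one 1D sequence, row by row
flat = image.ravel()

# reshape() rearranges the same six values into 3 rows and 2 columns
rearranged = flat.reshape(3, 2)

print(image)
print(flat)
print(rearranged)
```

Neither call changes the values; only the way they are arranged along the dimensions differs.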

You see, I'm very open to most of the points you listed. In general, I think the notebook collection would profit from another expert (e.g. you) making the notebooks more useful to a broader audience.

Looking forward to your first PR!

Cheers, Robert

guiwitz commented 2 years ago

Great, I see that we agree on most points, so I'll make a few PRs soon. Just regarding Numpy in general: I also avoid going into complicated functions and just show one example like reshape so that students understand the logic of working with dimensions. My point is rather to make clear that Numpy is something separate from standard Python that allows one to do things that are not possible with lists (I often get the question "why can't I do that with a list"). For example something like this: https://guiwitz.github.io/DAVPy/02-Numpy_arrays.html#numpy-arrays. But I agree with your point that there's no reason in general to explain first why things could be complicated when in fact they aren't! I'll make a notebook suggestion and we can discuss further in the PR.
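A minimal sketch of the "why can't I do that with a list" situation. The values are made up, and this is not taken from the linked course page.

```python
import numpy as np

pixel_list = [1, 2, 3]
pixel_array = np.array([1, 2, 3])

# Multiplying a list repeats it; multiplying an array scales each element
doubled_list = pixel_list * 2
doubled_array = pixel_array * 2

# Adding a number to a list raises a TypeError
try:
    pixel_list + 1
    list_addition_works = True
except TypeError:
    list_addition_works = False

# The same operation on an array is applied to every element
shifted_array = pixel_array + 1

print(doubled_list)
print(doubled_array)
print(list_addition_works)
print(shifted_array)
```

The same operator means different things for the two types, which is usually where the confusion starts.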

haesleinhuepf commented 2 years ago

Just regarding Numpy in general: I also avoid going into complicated functions and just show one example like reshape so that students understand the logic of working with dimensions. My point is rather to make clear that Numpy is something separate from standard Python that allows one to do things that are not possible with lists (I often get the question "why can't I do that with a list"). For example something like this: https://guiwitz.github.io/DAVPy/02-Numpy_arrays.html#numpy-arrays.

That's perfect! I remember that my students were confused this year and one asked "Why are python arrays so weird? Shouldn't it work like numpy-arrays?" :-D