[x] First give an overview of the steps that the student will do, i.e. which tasks there are and what each one does
[x] Are they using CPUs or GPUs? Can they use GPUs?
[x] Mention that we only use a subset of the original data
[x] Explicitly write (in the beginning) what it is that we want to achieve (learning objectives)
[x] You sometimes use hmp2 and other times ibd for data and configs. This is confusing
[x] 2. Inspecting the data
    [x] 1. How can I inspect the data? Please give directions
    [x] 2. The data folder contains all the ibd and hmp2 files: MGX, MBX and MTX. Explain that they are for two different experiments (if they are)
    [x] 3. When you make people run commands, indicate what the commands will output (e.g. 2 figures below etc.)
    [x] 4. What are the differences between the two Mutual Information plots? And why do you only show them for hmp2.mbx data, whereas the PCC plots are for ibd.mbx?
[x] 3. Encoding the data
    [x] 1. Be more informative on how the data is encoded. For instance, most people will not know what a "binary bit flag" is. Instead write, for instance, 0 and 1
    [x] 2. Before/after normalization: give more explanation of what you see and what the effect of normalization is. Why does it help to normalize? Also, the legend on the z-axis is not visible
    [x] 3. Explain what is meant by the shape of the datasets and what the output means, e.g. "(283,1 ,2)"
[x] 4. Hyperparameter optimization
    [x] 1. Specify which hyperparameters you are testing and why
    [x] 2. Indicate where this can be changed
    [x] 3. "The output of the previous command is a TSV table called ..." Can you indicate how to inspect it?
    [x] 4. The output of the hyperparameter training: why is it shown and what do the plots mean? I.e. what is reconstruction accuracy and what do we expect?
    [x] 5. Summarize this a bit. Which hyperparameters do we end up using?
[x] 5. Latent space ...
    [x] 1. Which architecture is used for training? Is it the architecture from point 4? Also, what are the actual numbers?
    [x] 2. Indicate the expected runtime whenever a step takes more than a few moments, e.g. model training
    [x] 3. A bit more explanation of the plots. Can you print out in the output a brief title/explanation of the plots?
    [x] 4. The reconstruction plot comes after the latent scatter plots in the output, but is described before them in the text above
    [x] 5. Much more explanation of the individual plots; you must guide the user
    [x] 6. Also, what does SHAP mean (intuitively, not in mathematical terms)?
[x] 6. Identifying associations between features
    [x] 1. Mention that you only use the Bayes form (if that is the only one you run)
    [x] 2. Print associations in a cell as a table
    [x] 3. Which perturbation are you doing? Please indicate this beforehand
[x] 8. "Possible Issues" is numbered 8, but comes after point 9
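
Regarding the item on inspecting the TSV table produced by the hyperparameter step: a minimal sketch of the kind of snippet the tutorial could include, using only the Python standard library. The table contents and column names below (`param_a`, `param_b`, `score`) are made-up placeholders, not the real output of the tutorial, and the helpers `read_tsv` and `preview` are hypothetical names introduced for illustration.

```python
import csv
import io

# Placeholder table standing in for the real TSV file.
# Column names and values are invented for illustration only.
EXAMPLE_TSV = (
    "param_a\tparam_b\tscore\n"
    "16\t0.1\t0.80\n"
    "32\t0.1\t0.85\n"
    "32\t0.5\t0.90\n"
)


def read_tsv(handle):
    """Return the rows of a tab-separated file as a list of dicts keyed by column name."""
    return list(csv.DictReader(handle, delimiter="\t"))


def preview(rows, n=5):
    """Format the header and the first n rows as aligned plain text."""
    if not rows:
        return "(empty table)"
    columns = list(rows[0].keys())
    head = rows[:n]
    # Each column is as wide as its longest entry (header included).
    widths = [max(len(c), *(len(r[c]) for r in head)) for c in columns]
    lines = ["  ".join(c.ljust(w) for c, w in zip(columns, widths))]
    for row in head:
        lines.append("  ".join(row[c].ljust(w) for c, w in zip(columns, widths)))
    return "\n".join(lines)


rows = read_tsv(io.StringIO(EXAMPLE_TSV))
print(f"{len(rows)} rows, {len(rows[0])} columns")
print(preview(rows))
```

With a real file, the in-memory string would be replaced by `open(path, newline="")`, where `path` points at the TSV named in the tutorial text. Values are read as strings, so they would need converting with `float` or `int` before sorting or plotting.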