Notes from June 2019 workshop

jorisvandenbossche commented 5 years ago

General:

first morning: mention that people need to download the latest version of the material (idea: tag a release before the course)
is %matplotlib inline still needed?

Slides:

Pandas 1:

Pandas 3 selecting:

Pandas 3b indexing:

[x] update the SettingWithCopyWarning section with more explanation (temporary variable, boolean selection on population; show both forms and that the chained one does not work, whereas countries.loc[countries['population'] > 50, 'population'] = 50 does)
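
A minimal sketch of the two forms this item asks to show, assuming a small made-up `countries` table (the values are illustrative, not the course data). The chained form assigns into the temporary object created by the boolean selection, so the original frame stays unchanged; the single `.loc` call modifies it in place.

```python
import warnings

import pandas as pd

# Hypothetical data, only for illustration (population in millions).
countries = pd.DataFrame({
    "country": ["Belgium", "France", "Germany"],
    "population": [11.3, 64.3, 81.3],
})

# Chained indexing: the boolean selection returns a temporary copy,
# and the assignment lands on that copy instead of on `countries`.
# pandas warns about this; the warning is silenced here to keep the output clean.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    countries[countries["population"] > 50]["population"] = 50
chained_result = countries["population"].tolist()

# Single .loc call: row selection and column assignment happen in one step,
# directly on `countries`.
countries.loc[countries["population"] > 50, "population"] = 50
loc_result = countries["population"].tolist()

print("after chained assignment:", chained_result)
print("after .loc assignment:   ", loc_result)
```

The first printed list still contains the original values, while the second has the large populations capped at 50.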

Pandas 4:

Pandas 6 - groupby:

pandas 7 reshape:

Visualization - plotnine:

Visualization - landscape:

Case 1 - bike count:

Case 2 - biodiversity processing:

[x] groupby doesn't count NaNs, value_counts does -> add a note to explain that difference?
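
The note does not say which calls were compared, so the following is only one possible reading, sketched on made-up data (the `species` and `weight` names are assumptions). It shows that `count()` after a `groupby` skips missing values in the counted column, while `value_counts()` counts rows per key, and that missing keys only show up with `dropna=False`.

```python
import numpy as np
import pandas as pd

# Hypothetical observations, only for illustration.
observations = pd.DataFrame({
    "species": ["DM", "DM", "DM", "PF", np.nan],
    "weight": [40.0, np.nan, 42.0, 8.0, 9.0],
})

# count() after a groupby counts the non-missing values of the selected column,
# so the DM row with a missing weight is not counted.
groupby_count = observations.groupby("species")["weight"].count()

# value_counts() counts rows per key, regardless of other columns.
key_counts = observations["species"].value_counts()

# By default a missing key is dropped; dropna=False keeps it as its own entry.
key_counts_with_nan = observations["species"].value_counts(dropna=False)

print(groupby_count)
print(key_counts)
print(key_counts_with_nan)
```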

Case 2 - biodiversity analysis:

Case 3 bacterial resistance:

[x] initial tidying: we lose the "experiment id" or "repetition id" from the original data (multiple repetitions for the same phage / genotype, which is now a single row) -> that information is lost
[x] creation of density_mean -> select the 'optical_density' column before .mean(), so that no mean of survival etc. is computed (in solution)
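
A minimal sketch of the `density_mean` point, assuming a made-up tidy table; only `optical_density`, `survival` and `density_mean` come from the notes, while the grouping column names and values are hypothetical. Selecting the column before `.mean()` restricts the aggregation to that column.

```python
import pandas as pd

# Hypothetical tidy data; the grouping column names are assumptions.
data = pd.DataFrame({
    "Bacterial_genotype": ["wt", "wt", "mut", "mut"],
    "Phage_t": ["C_phage", "C_phage", "C_phage", "C_phage"],
    "optical_density": [0.25, 0.75, 1.0, 2.0],
    "survival": [0.5, 0.7, 0.9, 1.0],
})

# Select 'optical_density' before .mean(), so only that column is averaged
# and no mean of 'survival' ends up in the result.
density_mean = (
    data.groupby(["Bacterial_genotype", "Phage_t"])["optical_density"]
    .mean()
    .reset_index()
)

print(density_mean)
```

The result holds one row per genotype / phage combination, with the two grouping columns and the averaged `optical_density`.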

stijnvanhoey commented 4 years ago

@jorisvandenbossche can we close this issue and maybe move the remaining question to a separate issue?
