JuliaAI / DataScienceTutorials.jl

A set of tutorials to show how to use Julia for data science (DataFrames, MLJ, ...)
https://juliaai.github.io/DataScienceTutorials.jl/
MIT License
116 stars 18 forks source link

Towards validating tutorials render correctly on the website #204

Closed EssamWisam closed 7 months ago

EssamWisam commented 8 months ago

Any changes required to make them do will be made in this PR.

EssamWisam commented 8 months ago

I expected to fix the plots for the files that were broken earlier hence the branch name but after running them locally now (and checking the corresponding PRs) seems like @ablaom has fixed them as well. Thank you for that.

In the commit above I add missing Project.toml and Manifest.toml files for the new credit fraud tutorial.

EssamWisam commented 8 months ago

In the commit above non-code in seven tutorials was modified to let serving pass. This includes exactly two things:

Note that the fact that serving passed doesn't imply that everything will render correctly as code errors are treated as warnings. This is what I will be looking into now.

EssamWisam commented 8 months ago

Data Basics Equivalence Check

Check is done by comparing old and new website side by side for the specific tutorial.

Tutorial Name Check Results
D0-categorical Success
D0-dataframe Success
D0-loading Success
D0-processing Need to fix Franklin Warning Failed to precompile Plots. Success; also had to addCSV.
D0-scitype Success

A note about the error above is that (most) plots still show successfully but the warning shows as an error in the notebook. Edit: NVM. It shows old plots of the same name...

EssamWisam commented 8 months ago

Getting Started Equivalence Check

Tutorial Name Check Results
A-model-choice Success
A-fit-predict Success
A-model-tuning Success but minor inconsistency due to RNG
A-ensemble Success but minor inconsistency due to RNG
A-ensemble-2 Success but very minor inconsistency due to RNG
A-composing-models Success but actually modified
A-ensemble-3 Success but actually (minorly )modified
Stacking Success but actually modified
EssamWisam commented 8 months ago

Introduction to Statistical Learning Equivalence Check

Tutorial Name Check Results
Lab 2 Success with minor RNG issue
Lab 3 Success
Lab 4 Success after typo fix
Lab 5 Success
Lab 6b Success with likely RNG issues
Lab 8 Success
Lab 9 Success
Lab 10 Success
EssamWisam commented 7 months ago
Tutorial Name Check Results
EX-telco Success but actually modified
EX-GLM Success
EX-airfoil Success
EX-boston-flux Success
EX-breastcancer Success
EX-horse Success except for slightly inconsistent conclusions in A baseline model section
EX-housekingcounty Success
EX-powergen Success
EX-wine Success ~except for a very minor conclusion in Getting a Baseline section~
EX-AMES Success ~but may be misunderstanding something as an inconsistency~
EX-crabs-xgb Success ~with a plot-conclusion inconsistency in XGBoost Machine as an exception~
⛔ EX-boston-lgbm Unable to run full tutorial due to NULL library handle
EssamWisam commented 7 months ago

Summary of Checks:

I have also added a commit where I have increased the font sizes for some tutorials. Some tutorials had ordinarily-sized and legible text plots so I didn't touch them. Otherwise, I increased font scaling then viewed the tutorial again. I don't always scale with the same number because when plot(machine) is present a smaller font is preferred to prevent text from clashing with the axes.

EssamWisam commented 7 months ago

Take note that plots may appear larger than expect while serving. I wasn't able to solve this by changing the figure size with Plots.jl but was able to solve it by limiting image sizes via CSS (but I didn't push any style changes).

EssamWisam commented 7 months ago

Will say that for me, Franklin's serve() with an empty directory doesn't behave the same as serve() followed by serve(eval_all=true) which is surprising (and took me time to figure out).

Think I am now done with validating the tutorials and plots for all tutorials except for LightGBM.

I will now move on to other PRs related to the navigation bar items or styling.

ablaom commented 7 months ago

@EssamWisam Are you able to resolve the conflicts here? Or is this PR now redundant?

EssamWisam commented 7 months ago

@EssamWisam Are you able to resolve the conflicts here? Or is this PR now redundant?

This is the most important PR on the table xD. I think solving the conflict should be easy. It's just because in the other PR we merged already I moved A-ensembles-3 to advanced.

ablaom commented 7 months ago

I found some broken links here (first two):

Screen Shot 2024-04-03 at 12 11 41 PM

But actually as these resources are now quite old, let's just remove them (keeping the preceding link to Telco). I'm probably not going to update them soon.

ablaom commented 7 months ago

The link to external resources points to a non-existent file. You can find what was planned for this here.

Let's remove "MLJTutorials" from this list, and all other references to it, as it is quite old.

ablaom commented 7 months ago

A small annoyance is that switching tabs on the landing page is not permanent, in the sense that going back after following a link always returns you to the "New to Julia?" tab. If this is a lot of work to fix, however, let's just create an issue to flag it for later.

ablaom commented 7 months ago

The first link here is broken:

Screen Shot 2024-04-03 at 12 24 44 PM

And "top of this page" no longer makes sense. Maybe just "click here".

ablaom commented 7 months ago

A bunch of links from the tabs to intro to stats learning are broken. I think there is a spelling mistake (my bad):

Screen Shot 2024-04-03 at 12 26 55 PM

("learning" not ".earning")

EssamWisam commented 7 months ago

@ablaom The commits above address all the commends you mentioned. I tried to be cautious while testing so I can merge this directly.