"passenger line insurance company" -> "passenger life insurance company"?
"4. After login in," -> "4. After logging in,"
"Note: If no experiments have been run then the list will be empty." This is under point 3, but should be under point 2 (Experiments).
"Join the H2O community on Slack to Ask Questions" This link doesn't work.
"about the latest at H2O.ai updates and more" (remove "at")
"Some of the major regulatory statutes potential governing these industries’" ("potentially"? or remove "potential")
"In part 2 of this blog series, we explore more techniques for enhancing trust in AI and machine learning models and systems." In addition to being a tutorial, is this all being externalized as a blog as well, or is this just a cut & paste issue that should be removed?
Similarly, there's a spot that says "and that’s what we will do in the rest of this post." Should that say "rest of this tutorial" or be removed?
"This dataset contains the list of estimated*" Why the asterisk? I don't see it making reference to something elsewhere.
"attributes of the each passenger:" (remove "the")
It says the overall Titanic dataset is 1309 rows, with the training set being 978 and the test set being 332. Taking out headers, it should be 977 and 332 for a total of 1309 (i.e. just the training set size has to be changed).
For the test set it says "15 attribute columns representing attributes of the each passenger." It's actually 14 (or 13 if disregarding "survived").
And it talks about having removed "boat", "body" and "home.des" but the data files we download for the tutorial have them in there, which ultimately impacts the model training. Either they should be removed from the datasets that are downloaded or there should be steps to drop them for the experiment.
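If the fix is to clean the downloadable files rather than add drop steps in the UI, something like the following would do it. This is only a rough sketch using the Python standard library: the column names are taken as quoted in the tutorial text (the actual CSV headers may be spelled differently), `drop_columns` is a hypothetical helper, and the sample rows are made up rather than real Titanic data. It also returns the data row count, which makes it easy to double-check the 977/332 split mentioned above.

```python
import csv
import io

# Column names as quoted in the tutorial text; the real CSV headers may differ.
COLUMNS_TO_DROP = ("boat", "body", "home.des")


def drop_columns(csv_text, columns=COLUMNS_TO_DROP):
    """Return (CSV text without the given columns, number of data rows)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Keep every header that is not in the drop list, preserving order.
    kept = [name for name in reader.fieldnames if name not in columns]
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=kept, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    row_count = 0
    for row in reader:
        writer.writerow(row)  # dropped columns are ignored on write
        row_count += 1
    return out.getvalue(), row_count


# Tiny made-up sample, not real Titanic rows.
SAMPLE = (
    "survived,sex,boat,body,home.des\n"
    "1,female,2,,Somewhere\n"
    "0,male,,135,Elsewhere\n"
)
cleaned, n_rows = drop_columns(SAMPLE)
print(cleaned)
print(n_rows)
```

To apply it to the real files, read the contents of Titanic_0.750.csv and Titanic_0.250.csv into a string, pass each through the helper, and write the result back out.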
"View a summary the dataset or preview the daset ". Add "of" before the dataset and fix "daset" spelling mistake.
For the Dataset Details screen shot, with the Titanic training data, it shows a passenger ID column but that column doesn't exist in the Titanic_0.750.csv file. Not a huge deal, but might be confusing to somebody who's expecting it to look this way.
"Date columns are given a str type." "str" -> "string"
Under the Dataset Rows screen shot, in the Things to Note section, you have two #4s listed... the 2nd one should be removed.
In a few places you mention the file names "train_0.75.csv", "train_0.75.xlsx ", "Training_0.75" and "Test_0.25", but the files you have on AWS are called "Titanic_0.750.csv" and "Titanic_0.250.csv".
"other variables on the dataset." "in" the dataset?
"Time Column - Provides a time order(time stamps for observations) Hover over any of the yellow triangles for additional information" A couple of periods missing.
"Click Dropped Columns, drop the name(Miss., Mrs., and Mr.) column " This doesn't exist in the Titanic_0.750.csv dataset. Looking back at your original table near the top of the tutorial with the description of the Titanic dataset, it includes columns that aren't in Titanic_0.750.csv and Titanic_0.250.csv (the files we're asked to download later).
The settings it comes up with by default for me are 8/2/7, not 8/2/8 as shown. It would be useful to mention that what the reader sees might be slightly different and they should just accept the defaults rather than changing to match what you are showing (unless you want them to change it to that). Actually, it is probably best to include a statement near the beginning of the tutorial that says that over time the interface might change slightly, new capabilities might be added, etc., which means that what's shown in the tutorial might not match exactly what the reader sees, due to different versions, etc.
"Status of parameter tuning followed, feature engineering and scoring pipeline." "followed by"?
"(unable to adjust then while experiment is running)" "then" -> "them"
"These transformation created with the following transformers:" -> "These transformations were created..."
With regard to the ROC curve description in Task 7, "You can check this out on the graph above." Should this say "below"? Also, the discussion of LR and T4 makes no sense to me. I'm not well-versed in data science techniques, so it could be that I don't understand because I'm a newb, but I'm not seeing these values in the curve itself. Was this cut and pasted from somewhere else with a different graph?
Generally speaking, I find some of these details kind of deep. Are you targeting the lay person with this tutorial... data scientists... everybody? As an intro tutorial, I'd suggest removing or simplifying some of the deeper details in Task 7.
In the discussion of Lift Charts & Gains, it's talking about respondents and customers. It looks like there's some additional context that's missing here. Same with the Gains chart.
"Experimentpage" Missing space.
"Driverless AI predictions (middle, green, in " The sentence ends here... something missing?
"About this plot" Is there a reason for the ?
Missing colons after names for points 2, 3, 4, 5 in MLI Dashboard section
Does Task 9 need to be on its own or should it be combined with Task 10?
"Open the auto-generated pdf report and review the experiment results." I could be wrong (I don't have the newest level on our server to test) but I thought you now ship a Word doc instead of (or in addition to?) a PDF doc.
Good tutorial. Some review comments: