Aditi99b commented 7 months ago

Aditi Bharadwaj Period 3 Lopez

Aditi99b commented 6 months ago

https://github.com/mwaskom/seaborn-data/blob/master/process/titanic.py

Aditi99b commented 6 months ago

import seaborn as sns
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import OneHotEncoder
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline

Load the attention dataset using Seaborn

attention_data = sns.load_dataset('attention')

Display the columns and a sample of the dataset

print("Attention Data Columns:")
print(attention_data.columns)
print("\nSample of the Attention Data:")
print(attention_data[['subject', 'attention', 'solutions', 'score']].head())

Split the data into features (X) and target (y)

X = attention_data[['subject', 'attention', 'solutions']]
y = attention_data['score']

Define a preprocessing pipeline

preprocessor = ColumnTransformer(
    transformers=[
        ('cat', OneHotEncoder(), ['attention'])
    ],
    remainder='passthrough'
)
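To see what the `OneHotEncoder` step does to the `attention` column, here is a minimal sketch on a tiny hand-made frame. The three rows in `toy` are made up for illustration and are not taken from the dataset.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Hypothetical three-row frame; the values are illustrative only
toy = pd.DataFrame({'attention': ['focused', 'divided', 'focused']})

encoder = OneHotEncoder()
# fit_transform returns a sparse matrix by default, so convert it for display
encoded = encoder.fit_transform(toy[['attention']]).toarray()

# Categories are sorted alphabetically: 'divided' first, then 'focused'
categories = list(encoder.categories_[0])
print(categories)
print(encoded)
```

Each row ends up with one column per category and a 1 in the column that matches its value, which is the numeric form the linear regression can work with.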

Define the linear regression model

regressor = LinearRegression()

Create a pipeline with preprocessing and linear regression

pipeline = Pipeline([
    ('preprocessor', preprocessor),
    ('regressor', regressor)
])

Split the data into training and testing sets

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

Fit the pipeline on the training data

pipeline.fit(X_train, y_train)
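The split above produces `X_test` and `y_test`, but nothing in this comment uses them. One way to check the fitted pipeline is to score it on that held-out portion. The sketch below does this on a synthetic stand-in frame so it runs without downloading anything: `demo`, its generated values, and the name `demo_pipeline` are assumptions for illustration, and the numbers it prints say nothing about the real attention data.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Synthetic stand-in that only borrows the column names; NOT the real data
rng = np.random.default_rng(0)
n = 60
demo = pd.DataFrame({
    'subject': np.arange(1, n + 1),
    'attention': np.where(np.arange(n) % 2 == 0, 'divided', 'focused'),
    'solutions': rng.integers(1, 4, size=n),
})
# Made-up relationship plus noise, just so there is something to fit
demo['score'] = (
    4.0
    + 1.5 * (demo['attention'] == 'focused')
    + 0.5 * demo['solutions']
    + rng.normal(0, 0.5, size=n)
)

X = demo[['subject', 'attention', 'solutions']]
y = demo['score']

# Same structure as the pipeline in this comment
demo_pipeline = Pipeline([
    ('preprocessor', ColumnTransformer(
        transformers=[('cat', OneHotEncoder(), ['attention'])],
        remainder='passthrough',
    )),
    ('regressor', LinearRegression()),
])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)
demo_pipeline.fit(X_train, y_train)

# Evaluate only on rows the model did not see during fitting
y_pred = demo_pipeline.predict(X_test)
mae = mean_absolute_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print('Mean absolute error on the test set:', mae)
print('R^2 on the test set:', r2)
```

With the real data, the same three lines at the end (`predict`, `mean_absolute_error`, `r2_score`) can be applied to the `pipeline`, `X_test`, and `y_test` already defined in this comment.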

Define a new observation to predict a score for

new_observation = pd.DataFrame({
    'subject': [5],            # any subject ID
    'attention': ['focused'],  # 'divided' or 'focused' (lowercase, as in the dataset)
    'solutions': [3]           # 1, 2, or 3
})

Predict the score for the new observation

score_prediction = pipeline.predict(new_observation)

Print the predicted score

print('\nPredicted Score:')
print(score_prediction[0])
