Closed ETA444 closed 6 months ago
predict_ml() is designed to handle both statistical inference and predictive model selection, depending on the data and options the user provides. It automates the steps from data preprocessing through model tuning, so models can be evaluated and selected with either machine learning or statistical methods.
if not df.empty and formula:
    # Proceed with statistical inference
    pass
elif x_cols and y_col and not df.empty:
    # Proceed with machine learning model selection
    pass
else:
    raise ValueError("Insufficient input data provided.")
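The branching above can be isolated into a small helper, which makes the precedence explicit: a formula wins whenever one is given, even if x_cols and y_col are also set. This is a minimal sketch, not the project's implementation. The name resolve_mode and its return strings are hypothetical, and the df argument only needs a boolean .empty attribute, as a pandas DataFrame has.

```python
def resolve_mode(df, formula=None, x_cols=None, y_col=None):
    """Return 'inference' or 'ml' depending on which inputs were supplied.

    Hypothetical helper that mirrors the if/elif/else check and nothing more.
    `df` is any object exposing a boolean `.empty` attribute.
    """
    if df.empty:
        # Neither path can run without data.
        raise ValueError("Insufficient input data provided.")
    if formula:
        # A formula takes precedence over x_cols / y_col.
        return "inference"
    if x_cols and y_col:
        return "ml"
    raise ValueError("Insufficient input data provided.")
```

Keeping the check in one place means the error message and the precedence rule are defined once, rather than repeated wherever the mode is needed.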
if data_state == 'unprocessed':
    processed_data = data_preprocessing_core(
        df, x_cols, y_col, test_size, random_state,
        numeric_imputer, numeric_scaler,
        categorical_imputer, categorical_encoder,
        text_vectorizer, datetime_transformer, verbose)
else:
    processed_data = df[x_cols + [y_col]]
recommended_models = model_recommendation_core(processed_data, task_type, priority_metrics, n_top_models, cv, verbose)
tuned_models = model_tuning_core(recommended_models, task_type, priority_tuners, custom_param_grids, n_jobs, cv, n_iter_random, n_iter_bayesian, refit_metric, verbose, random_state)
if formula:
    inference_results = model_recommendation_core_inference(df, formula, priority_models, n_top_models, model_kwargs, verbose)
return {
    'ML_Models': tuned_models if not formula else None,
    'Statistical_Models': inference_results if formula else None
}
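The fragments above can be read as one control flow. The outline below is a hedged sketch of how they might fit together, not the actual function body. The four callables (preprocess, recommend, tune, infer) are hypothetical stand-ins for the project's *_core functions, whose real signatures take many more arguments. It also assumes the machine learning steps are skipped when a formula is given, which the fragments do not state.

```python
def predict_ml_sketch(df, *, preprocess, recommend, tune, infer,
                      formula=None, x_cols=None, y_col=None,
                      data_state="unprocessed"):
    """Hypothetical outline of the dispatch, pipeline, and return value."""
    if df.empty or not (formula or (x_cols and y_col)):
        raise ValueError("Insufficient input data provided.")

    if formula:
        # Statistical inference path (assumed to bypass the ML steps).
        return {"ML_Models": None,
                "Statistical_Models": infer(df, formula)}

    # Machine learning path: preprocess only when the data is still raw.
    if data_state == "unprocessed":
        data = preprocess(df, x_cols, y_col)
    else:
        data = df[x_cols + [y_col]]

    tuned = tune(recommend(data))
    return {"ML_Models": tuned, "Statistical_Models": None}
```

Either way the caller receives a dictionary with both keys present, and exactly one of them is None, so downstream code can check which key holds results instead of tracking which mode was requested.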
ml_models = predict_ml(df, x_cols=['Age', 'Department'], y_col='Salary', verbose=2)
inference_models = predict_ml(df, formula='Salary ~ Age + C(Department)', verbose=2)
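The two calls describe a model in different ways: the formula names the response to the left of ~ and the predictors to the right, with C(...) marking a categorical term. The sketch below extracts those names from simple additive formulas only, to show how a formula corresponds to y_col and x_cols. It is not a substitute for a real formula parser, and the function name formula_to_columns is hypothetical.

```python
import re

def formula_to_columns(formula):
    """Split a simple additive formula into (y_col, x_cols).

    Handles only the shape 'y ~ a + b + C(c)'. Interactions, intercept
    terms, and transforms other than C(...) are out of scope for this sketch.
    """
    lhs, sep, rhs = formula.partition("~")
    if not sep or not lhs.strip() or not rhs.strip():
        raise ValueError(f"Not a two-sided formula: {formula!r}")

    x_cols = []
    for term in rhs.split("+"):
        term = term.strip()
        # Unwrap C(name) to the bare column name; keep other terms as-is.
        match = re.fullmatch(r"C\((\w+)\)", term)
        x_cols.append(match.group(1) if match else term)
    return lhs.strip(), x_cols

y_col, x_cols = formula_to_columns("Salary ~ Age + C(Department)")
print(y_col, x_cols)  # Salary ['Age', 'Department']
```

In the formula, the response Salary does not appear among the predictors. An x_cols list for the equivalent machine learning call should likewise exclude y_col, so the target does not leak into the features.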
Title: Develop Automated Predictive Model Selection and Statistical Inference Tool
Description: This project aims to develop an automated tool, predict_ml(), for predictive model selection and statistical inference. The tool streamlines data preprocessing, model selection, and tuning, and recommends the best model based on the user's data and preferences.
Proposed Changes: Implement predict_ml(), capable of handling both machine learning (ML) and statistical inference tasks based on user inputs.
Expected Outcome: Upon completion, the predict_ml function will automate predictive modeling and statistical inference tasks end to end. By integrating data preprocessing, model selection, and tuning into a single entry point, it will reduce manual setup, support informed model choice, and streamline the modeling process.
Additional Context: The proposed development addresses the growing need for automated tools that simplify and speed up predictive modeling and statistical analysis. By offering a versatile, user-friendly function, this project aims to let users analyze data efficiently and derive actionable insights in domains that rely on data-driven decision-making.