Open TheUsefulNerd opened 1 month ago
@TheUsefulNerd Two weeks is more than enough time for this project. Let’s aim to get it done in one week instead.
Ok, I will complete the project within a week and submit a pull request.
Is it necessary to use a .py file or can I use .ipynb files? For model.py and predict.py, because I will be adding notebooks where I will train my model.
Yes, it is necessary to use the model.py and predict.py files for model definition and predictions to maintain consistency within the project structure. You can do your training and experimentation in Jupyter notebooks and place them in a dedicated notebooks/ directory. This way, we keep the code modular and organized while allowing for interactive development. Let me know if you have any other queries.
@TheUsefulNerd assigned
Hey @Pranshu-jais, I have been working on this model for a few days. I have got 60% accuracy using XGBoost, the highest among the models I tried. Can I take 2 more days to improve the model's accuracy and then submit the PR? The dataset has around 1 lakh (100,000) rows and 50 columns with numerical and categorical features, missing values, and outliers, so I need time to clean up the dataset.
@TheUsefulNerd Yes, you can.
@Pranshu-jais, @yashasvini121, I’ve worked on the dataset and achieved an accuracy of 89% using the Random Forest model. However, I’m facing challenges with the recall and precision metrics. Despite oversampling the data and implementing various resampling techniques, I’m still not getting the desired results. I’ve also consulted additional resources, but they address similar issues. Do you think it's acceptable for me to submit the PR?
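A note on why high accuracy and poor precision/recall can coexist on an imbalanced dataset like this one: the sketch below computes all three metrics from confusion-matrix counts. The counts are made up for illustration (they are not results from this project), and `classification_metrics` is a hypothetical helper name.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall) from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical counts: 1,000 patients, of whom 100 were actually readmitted.
# The model flags only 20 patients, and half of those flags are correct.
accuracy, precision, recall = classification_metrics(tp=10, fp=10, fn=90, tn=890)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

With these illustrative counts, accuracy is 0.90 while recall is only 0.10, because the large non-readmitted class dominates the accuracy figure.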
Yes, you can submit, but make sure you follow the current project structure and provide the model_details function.
Ok, Thanks.
Last two questions: when do you assign the level 1, 2, 3 labels to the repos? And should I add the model_details function in the model.py file? @yashasvini121
The levels are assigned after the PR is merged.
Yes, you could do that. If you face any difficulty, you can instead add the model_details function in your notebook.
Hey, I am unable to load the prediction form page on Streamlit. When I run my file or any file from the "pages" folder, it says that page_handler cannot be found: "ModuleNotFoundError: No module named 'page_handler'". Do you know how to fix this error? @Pranshu-jais @yashasvini121
@TheUsefulNerd, Sorry, but what do you mean by "running a file from the pages folder"? Additionally, you could push your work to your fork so that we can better understand your question.
To run the project as a whole, use the command streamlit run App.py.
Apparently my model.joblib file is larger than 100 MB, and I don't know why, so it's not letting me push the commit at all. @yashasvini121
You will need to compress your joblib file. Consider using the following command:
import joblib
joblib.dump(your_data, 'your_data_file.joblib', compress=3)  # compress takes an integer from 0 to 9; higher values give smaller files but slower dumps
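Before committing, it may help to check whether the compressed file actually fits. A minimal sketch using only the standard library follows; the helper names and the demo file are hypothetical, and the limit is assumed here to be 100 MiB per file.

```python
import os
import tempfile

# Assumption: GitHub's per-file limit, taken here as 100 MiB.
GITHUB_LIMIT_BYTES = 100 * 1024 * 1024


def file_size_mb(path):
    """Return the size of the file at `path` in MiB."""
    return os.path.getsize(path) / (1024 * 1024)


def fits_github_limit(path, limit_bytes=GITHUB_LIMIT_BYTES):
    """Return True if the file is strictly smaller than the assumed limit."""
    return os.path.getsize(path) < limit_bytes


# Demo on a small throwaway file; in practice, pass the path to your .joblib file.
with tempfile.NamedTemporaryFile(delete=False, suffix=".joblib") as demo:
    demo.write(b"\0" * 2048)

print(f"{file_size_mb(demo.name):.4f} MiB, fits: {fits_github_limit(demo.name)}")
os.remove(demo.name)
```

If the file is still too large at a given compress level, trying a higher level and re-running the check is a quick way to see whether compression alone will be enough.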
@yashasvini121 I have been trying to solve the issues for the last 7 hours and in the end, I get this:
git push origin master
batch response: @TheUsefulNerd can not upload new objects to public fork TheUsefulNerd/predictive-calc
error: failed to push some refs to 'https://github.com/TheUsefulNerd/predictive-calc.git'
Then on the streamlit page all the models are working but mine shows:
ModuleNotFoundError: No module named 'model'
I have checked the file structure, the import statements, the functions used, and the variable names. I tried code recommended by ChatGPT to fix it but still got the same errors. I am unable to understand what to do.
I am sorry to ask so many questions, but I really don't know what to do here.
I can’t give a definitive answer without more details, so please share the error screenshot next time for better clarity. However, based on my understanding:
Your master branch might be behind, which might cause merge conflicts. So try this:
Push your changes to a new branch on your fork using:
git push origin HEAD:new-branch
Verify the size of your pickle file; it must be less than 100 MB.
Regarding the model import error, I assume you’re trying to import model.py into predict.py (or another file). To fix this, ensure you are using the correct import syntax:
from models.<your-folder-name>.model import ...
For example:
from models.house_price.model import x
# Instead of: from model import x
Hope this works, let me know if you have any other issue.
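To see why the package-qualified import resolves while the bare one does not, here is a small self-contained sketch. It builds a throwaway copy of the models/house_price/model.py layout in a temporary directory (the file contents are hypothetical) and assumes the app is started from the repository root, so that the root, not the model's own folder, is on sys.path.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Throwaway stand-in for the repository root; contents are illustrative only.
root = Path(tempfile.mkdtemp())
package_dir = root / "models" / "house_price"
package_dir.mkdir(parents=True)
(package_dir / "model.py").write_text("x = 42\n")

# Assumption: only the repository root is on sys.path when the app runs.
sys.path.insert(0, str(root))
importlib.invalidate_caches()

# The bare import fails: there is no model.py directly under the root.
try:
    importlib.import_module("model")
    bare_import_ok = True
except ModuleNotFoundError:
    bare_import_ok = False

# The package-qualified import is resolved relative to the root and succeeds.
module = importlib.import_module("models.house_price.model")
print("bare import worked:", bare_import_ok)
print("qualified import gives x =", module.x)
```

The same reasoning explains why running predict.py on its own can behave differently from running the whole app: the directory that ends up on sys.path changes with how the script is launched.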
@yashasvini121 Ok, so I deleted the complete repo from my device and cloned it again and made changes again. I am currently facing this issue:
my code:
I did not change any file location.
@TheUsefulNerd, could you also mention the command you ran?
Silly question, but I don't see any issues otherwise. If it still doesn't work, I'll clone it and try it myself.
Because it looks like you have kept the repo in a predictive-calc folder, make sure you run streamlit run app.py and not streamlit run predictive-calc/app.py
@yashasvini121 I used the same command streamlit run app.py. I directly clicked on run button to run the .py file.
Also I successfully pushed all the code to my forked repo. Can you have a look at it once and tell me where I am going wrong? I pushed my code to "new-branch" Here is the link:
You cannot click the run button to run the files, i.e. you cannot run the files individually. You need to run the whole app. So try streamlit run app.py and then check if your page is working properly on the website. @TheUsefulNerd
Did that too:
Well, it's a spelling mistake: on line 155, it should be Diabetes Readmission Prediction
OK, now I can see the page there, but a new error has appeared. I have been facing this one for 2 days:
Instead of from model import DiabetesModel in predict.py line 1, write:
from models.Diabetes_Readmission_Prediction.model import DiabetesModel
Wow! I did the same thing a few hours ago and it showed me an error, and now it works!! I pushed the changes too!!
Thank you so much!!
https://github.com/TheUsefulNerd/predictive-calc.git What is the next step?
You're welcome. Create a model_details function as well, either in the notebook or in model.py. After that, create the PR for review.
:) I got another error now:
Hoorah!! Fixed it. No more errors! Submitting the PR for verification. Thanks a lot @yashasvini121
Hi @yashasvini121, I have been trying to improve my model for the last 5 days, and I was able to improve it. Now the problem is the model file size (.pkl). I used compress=9 and the file still came out at 169.3 MB, which exceeds GitHub's 100 MB limit. I asked ChatGPT and got this answer as an alternative:
Store the Model Externally (Preferred) One common practice is to store large files like models externally (e.g., on a cloud storage service) and reference them in your repository. This way, your repository stays light, and contributors can download the model if needed.
Google Drive / Dropbox / AWS S3 / Azure Blob Storage: Upload the model to one of these services, then provide a link in your repository’s README or code to download it when necessary.
Will this be feasible?
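For reference, the external-storage idea could look roughly like the sketch below: the app fetches the model file on first use and reuses the local copy afterwards. This is only an illustration under assumptions: `ensure_model_file` is a hypothetical helper, the URL in the usage comment is a placeholder, and the link must be a direct-download URL (ordinary share pages from some storage services will not return the raw file).

```python
import shutil
import urllib.request
from pathlib import Path


def ensure_model_file(local_path, url):
    """Download the file at `url` to `local_path` unless it already exists.

    Returns the local path as a Path object.
    """
    local_path = Path(local_path)
    if local_path.exists():
        return local_path  # reuse the copy fetched earlier
    local_path.parent.mkdir(parents=True, exist_ok=True)
    # Stream the response to disk so the whole file is not held in memory.
    with urllib.request.urlopen(url) as response, open(local_path, "wb") as out:
        shutil.copyfileobj(response, out)
    return local_path


# Hypothetical usage (placeholder URL, not a real location):
# path = ensure_model_file("saved_models/model.joblib", "https://example.com/model.joblib")
# model = joblib.load(path)
```

One caveat of this simple version: if the download is interrupted, a partial file is left behind and will be treated as complete on the next run, so a size or checksum check would be a sensible addition.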
Problem Description: The project aims to predict whether diabetic patients will be readmitted to the hospital within 30 days of discharge. Many hospitals struggle with managing diabetes properly, which leads to frequent readmissions. These readmissions increase costs for hospitals and worsen patient health. By predicting which patients are likely to be readmitted, hospitals can take preventive measures, improving patient care and reducing unnecessary costs.
Model Description: We will use a machine learning model called XGBoost, which is good at handling complex data and making accurate predictions. XGBoost is chosen because it performs well on medical data and can deal with situations where there are more non-readmitted patients than readmitted ones (class imbalance). The model will be trained on patient data, including demographics, medical history, and treatment details, to predict the likelihood of readmission. We will also use methods like SMOTE to balance the dataset and make the model more accurate.
Estimated Time for Completion: I will need 2 weeks to finish this project.
Expected Outcome: The model will help predict which diabetic patients are at high risk of being readmitted to the hospital within 30 days. This will allow hospitals to intervene early, reducing readmission rates, improving patient health, and lowering costs.