fsfrazao / DegDay

Degrees per day project for CMSC6950 course

On the Linear Regression Analysis #8

Open djo504 opened 8 years ago

djo504 commented 8 years ago

Hi @farayola

Any headway on the linear regression task? If you need any help, let me know, or we could work on it together to speed things up, if you don't mind. Thanks.

farayola commented 8 years ago

Thanks Dayo, I will push what I have done today. And yes, I need loads of help... I will write up the details and send them.

farayola commented 8 years ago

Hi Dayo, I have pushed what I have done so far. It is currently unpolished, but it works. What the code does is listed below:

- It accepts the path to an input directory (the assumption is that the directory consists of input files that already have their GDD computed, and thus end with _GDD.csv).
- It merges all the files in the directory into a single .csv file called data.
- It groups the data by year and sums up the corresponding GDD.
- It then plots this on a graph (this answers the first part of the question, which says to compare GDD year by year).
- Next, it carries out some regression analysis and prints out the summary. The summary is comprehensive: it has the intercept, r-squared, slope, and level of confidence, and you can construct the model from the code.

What is left:

- Polish the code by adding useful comments in the required format.
- Include an input directory and output directory in line with our spec.
- Use real data to see if there is any pattern, then publish this in the LaTeX report.

Let me know your thoughts and where you can help with the outstanding tasks. Thanks.
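The steps listed in this comment can be sketched roughly as follows. This is only a minimal illustration, not the code that was pushed: the column names `Date` and `GDD`, the `YYYY-MM-DD` date format, and the function names `yearly_gdd_totals` and `fit_line` are all assumptions, and the fit is a plain ordinary least squares computed by hand rather than whatever regression library the project uses.

```python
import csv
import glob
import os
from collections import defaultdict


def yearly_gdd_totals(input_dir):
    """Merge every *_GDD.csv file in input_dir and sum GDD per year.

    Assumes (hypothetically) that each file has a 'Date' column in
    YYYY-MM-DD form and a numeric 'GDD' column.
    """
    totals = defaultdict(float)
    for path in sorted(glob.glob(os.path.join(input_dir, "*_GDD.csv"))):
        with open(path, newline="") as handle:
            for row in csv.DictReader(handle):
                year = int(row["Date"][:4])
                totals[year] += float(row["GDD"])
    return dict(sorted(totals.items()))


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # If every y is identical, the line fits exactly.
    r_squared = (sxy * sxy) / (sxx * syy) if syy else 1.0
    return slope, intercept, r_squared
```

With the totals in hand, `fit_line(list(totals), list(totals.values()))` would give the slope, intercept, and r-squared that the summary reports. Files in the directory that do not end with _GDD.csv are ignored.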

fsfrazao commented 8 years ago

Great job @farayola! Just one observation: since the linear model should appear in the report too, we need a way to incorporate it into the LaTeX file instead of printing the summary on the screen. One way would be to save the model parameters to a text file and then have the LaTeX code read from it. I'm not very experienced with LaTeX, but that sounds unnecessarily tricky to me, so for the t_baseXGDD regression I simply included the model description in the plot. I also think it's useful to plot the line of best fit.

Have a look at my tbase_GDD_analysis.py code.
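Putting the model description in the plot and drawing the line of best fit might look something like the sketch below. It is not taken from tbase_GDD_analysis.py: the function names `model_label` and `plot_with_fit`, the axis labels, and the choice to show the model in the legend are assumptions made for illustration, and it uses yearly totals as example data.

```python
def model_label(slope, intercept, r_squared):
    """Format a fitted line as text suitable for a plot legend."""
    sign = "+" if intercept >= 0 else "-"
    return (
        f"GDD = {slope:.2f} * year {sign} {abs(intercept):.2f}"
        f"  (R^2 = {r_squared:.3f})"
    )


def plot_with_fit(years, totals, slope, intercept, r_squared, out_path):
    """Scatter the yearly totals, overlay the fitted line, and save the figure."""
    # Imported here so model_label stays usable without matplotlib installed.
    import matplotlib
    matplotlib.use("Agg")  # render straight to a file; no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.scatter(years, totals, label="Yearly GDD total")
    fitted = [intercept + slope * year for year in years]
    # The legend entry for the line carries the model description.
    ax.plot(years, fitted, color="red",
            label=model_label(slope, intercept, r_squared))
    ax.set_xlabel("Year")
    ax.set_ylabel("Accumulated GDD")
    ax.legend()
    fig.savefig(out_path)
    plt.close(fig)
```

Because the equation lives in the figure itself, the report only has to include the PNG; nothing needs to be parsed from the printed summary.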

djo504 commented 8 years ago

Hi farayola

farayola commented 8 years ago

Hi @Fabio,

I have tried to add the suggestions. I have one challenge left: using argparse to supply the input and output directories. I tried it, and it seems to run from the command line without producing any output, but it works well in IPython. Can you help take a look?

farayola commented 8 years ago

And this is not just for @Fabio; other team members can take a look too. I can't proceed any further, as I have to start working on the LaTeX. Thanks in anticipation.

bfsfrank commented 8 years ago

@farayola Hi, I have pushed one modification to your linear function. It runs, but I cannot confirm that the result is right. Please check whether the output PNG in the output folder is correct.

Hope it works well!

fsfrazao commented 8 years ago

@cj4755 I added a command line interface. Just like the other programs, it takes an input directory and the path to the output figure:

python linear_regression.py ./linear-csv-cum-input ./output/linear_model_plot.png

I tested it the way it was, and the plot and model were wrong, not because of any of the changes you made, but because the input data was missing (there is no data for that station after 2012, so artificial zeros were added to the dataset).

I also changed the input file so it takes more (and valid) years of data.
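A command line interface with these two positional arguments could be sketched with argparse as below. The argument names `input_dir` and `output_figure` and the functions `parse_args` and `main` are assumptions, not the contents of linear_regression.py. The closing comment shows the usual entry-point guard: a script that defines its functions but never calls them runs silently from the command line, which is one common cause of the "no output" symptom reported earlier, although the thread does not confirm that this was the cause here.

```python
import argparse


def parse_args(argv=None):
    """Parse the two positional arguments; argv=None means use sys.argv."""
    parser = argparse.ArgumentParser(
        description="Fit a linear model to yearly GDD totals."
    )
    parser.add_argument("input_dir",
                        help="directory containing the *_GDD.csv files")
    parser.add_argument("output_figure",
                        help="path of the PNG figure to write")
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    # The real analysis would run here; this sketch only reports the paths.
    print(f"Reading from {args.input_dir}, writing {args.output_figure}")
    return args


# In the script itself, call main() from the usual entry-point guard:
#     if __name__ == "__main__":
#         main()
```

Accepting an optional `argv` list keeps the parser easy to exercise from IPython or a test, for example `main(["./linear-csv-cum-input", "./output/linear_model_plot.png"])`.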

fsfrazao commented 8 years ago

@farayola, are you fine working with the textual output of the model?
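If the textual output is kept, the option raised earlier in this thread (saving the model parameters to a file that the LaTeX report reads) could be sketched as below. The function name `write_latex_macros`, the file name model_params.tex, and the macro names `\gddSlope`, `\gddIntercept`, and `\gddRsquared` are all hypothetical; nothing in the repository is known to use them.

```python
def write_latex_macros(path, slope, intercept, r_squared):
    """Write the fitted parameters as LaTeX macros, one \\newcommand per line."""
    lines = [
        f"\\newcommand{{\\gddSlope}}{{{slope:.3f}}}",
        f"\\newcommand{{\\gddIntercept}}{{{intercept:.3f}}}",
        f"\\newcommand{{\\gddRsquared}}{{{r_squared:.3f}}}",
    ]
    with open(path, "w") as handle:
        handle.write("\n".join(lines) + "\n")
    return lines
```

The report could then pull the values in with `\input{model_params.tex}` in the preamble and refer to them as `\gddSlope` and so on in the text, so the numbers update whenever the analysis is rerun.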

bfsfrank commented 8 years ago

@fsfrazao Cool, I have also updated the Makefile rule that runs the linear regression.