WeberLab-UW / 2022-Election-Material

Fork of @BlakeRMills repo for Web scraping, candidate data, and election visualizations - to modify for governors and mayors races 🤞

Daily time tracker #6

Open nniiicc opened 1 year ago

nniiicc commented 1 year ago

Please leave a comment here with what issue you worked on, and your progress. Record the number of hours you spent on the issue. Also include what you plan to work on / accomplish tomorrow.

peiwenf commented 1 year ago

Today (4h total)
Issue #4:
- Set up the machine and tested the "make report" function by generating reports locally (1.5h)
- Asked Eva for basic information about the project structure (0.5h)
- Read through the files on GitHub (1.5h)
Issue #3:
- Read through the methodology of The Markup (0.5h)
Tomorrow:
- Try to schedule a time with Eva to go over the project; I am still confused after reading the files
- Start generating reports (might need to change the script to fit our goal before doing this)

peiwenf commented 1 year ago

Sorry, I forgot to track my work yesterday. Here is a tracker for both 1.10 and 1.11:
1.10 (4h total)
Issue #4:
- Met with Eva to learn about last year's project (1h)
- Ran the generate report function in a loop locally (2.5h). I found that my computer can't handle this workload, so I will try to run the GitHub Actions locally instead
Issue #3:
- Tested the project with the GUI, studied the blacklight-collector repo, and learned about npm (0.5h)
1.11 (2h total, will make up the rest tomorrow!)
Issue #4:
- Set up GitHub CLI on my computer, wrote the script for running the GitHub Action locally, and studied the download function (1h)
Issue #3:
- Did more research on npm, ran blacklight-collector locally, and tested it with a website (1h)
Plan for tomorrow:
- Get admin access to the repo
- Run the GitHub Action locally
- Get the reports for attorney.csv done

peiwenf commented 1 year ago

Issue #4 (2h)
- Ran the GitHub Actions locally and got the reports for the majority of the candidates in the Attorney General races
Tomorrow:
- Figure out how to clean the data frames (depends on the answer to the question about issue #4)
- Get the analysis done for the Attorney General races

peiwenf commented 1 year ago

issue #4 (4h)

Tomorrow

peiwenf commented 1 year ago

Issue#4(4h)

Tomorrow:

peiwenf commented 1 year ago

Issue#4 (4h)

Tomorrow:

peiwenf commented 1 year ago

issue #4 (2h)

Documentation(2h)

Tomorrow

peiwenf commented 1 year ago

Issue #4 and Issue #5 (4h)

peiwenf commented 1 year ago

1/25 & 1/26 Issue #4 (8h)

Tomorrow

peiwenf commented 1 year ago

1/27 ISSUE #4 (4h)

peiwenf commented 1 year ago

ISSUE #4 (4h) Things tried:

peiwenf commented 1 year ago

1/31 & 2/1 Issue #4 (8h)
Things tried:
- Tried to import the modules from other folders but failed. Temporarily putting all the files in the same folder worked for testing; I will solve the path problem tomorrow
Things done:
- Fixed generate report
- Wrote the script to get the voting information
- Combined the axe-scraped data with the original data
Tomorrow:
- Reinstall the repo to make the paths work
- Finish the analysis for Attorney General and start on another race

peiwenf commented 1 year ago

2/2 Issue #4 (6h)
Things tried:
Things done:
- Recreated an environment to make the relative paths work
- Regenerated the data frame to add the word matrix
- Debugged the analysis script up to line 765; most of the bugs are caused by empty values from candidates who don't have a campaign page
Tomorrow:
- Create two datasets for the convenience of analysis: one containing all the candidates who don't have a campaign page, and one containing only the candidates who do
- Finish the analysis for Attorney General and finish the House elections
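The two-dataset split described above could be sketched roughly as follows. This is only an illustration, not the project's actual script: the column name `campaign_website_url` and the idea that a missing page shows up as an empty string or a literal "NaN" are assumptions.

```python
from typing import Dict, List, Tuple

Row = Dict[str, str]


def split_by_campaign_page(
    rows: List[Row],
    url_column: str = "campaign_website_url",  # hypothetical column name
) -> Tuple[List[Row], List[Row]]:
    """Partition candidate rows into (with_page, without_page)."""
    with_page: List[Row] = []
    without_page: List[Row] = []
    for row in rows:
        # Treat a missing key, None, blank text, or the string "NaN" as "no page".
        url = (row.get(url_column) or "").strip()
        if url and url.lower() != "nan":
            with_page.append(row)
        else:
            without_page.append(row)
    return with_page, without_page
```

Keeping the rows without a campaign page in their own list means the analysis script never sees their empty values, while the candidates themselves are still available for counts and reporting.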

peiwenf commented 1 year ago

2/3 Issue #4 (7h)
Things done:
- Created the two datasets
- Got the analysis for the Attorney General data frame and fixed all the nonsense data, like NaN values or unnecessary plots
Tomorrow:
- Ask about the next step
- Replicate the analysis for the other races

peiwenf commented 1 year ago

2.6 and 2.7 Issue #4 (11h)

Tomorrow: Plan to work at least 7 hours

peiwenf commented 1 year ago

2.8 Issue #4 (4h + 4h running code)
- Wrote a first draft of the script for getting voting data, with the logic of getting one race and then moving to the next. Then realized the special case of retention elections (asked in the issue) and found that the city election data is not as well formatted as the other datasets, so I might need to handle each candidate individually
- Reduced the number of unique races for city elections from 70 to 50, aiming to reduce it further after hearing back on the issue
- Generated the study data for Governors, which took 4.5 hours
Things I need help with, @nniiicc:
Combining axe data with the original data for Governor took more than 4 hours, and this is not the biggest data frame. While the terminal is running, my computer gets really slow, so I can't work on other tasks. The only solution I can think of is to divide the largest data frame into 5 pieces and then concatenate all the parts together. I was wondering if there are any other suggestions.
Tomorrow:
- Finish the analysis for Governor
- Prepare the reports generated for House for analysis
- Clean the race column of the City Elections and Municipal Elections data
- Get voting results for City Elections and Municipal Elections
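The divide-and-concatenate idea mentioned above could look something like the sketch below. It uses plain Python lists rather than the project's data frames, and `process_chunk` is a hypothetical stand-in for whatever per-row combining step the real script performs.

```python
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def split_into_chunks(items: Sequence[T], n_chunks: int) -> Iterator[Sequence[T]]:
    """Yield n_chunks contiguous slices whose sizes differ by at most one."""
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    size, remainder = divmod(len(items), n_chunks)
    start = 0
    for i in range(n_chunks):
        # The first `remainder` chunks take one extra item each.
        end = start + size + (1 if i < remainder else 0)
        yield items[start:end]
        start = end


def process_in_chunks(
    items: Sequence[T],
    n_chunks: int,
    process_chunk: Callable[[Sequence[T]], List[R]],  # hypothetical per-chunk step
) -> List[R]:
    """Run process_chunk on each piece and concatenate the results in order."""
    combined: List[R] = []
    for chunk in split_into_chunks(items, n_chunks):
        combined.extend(process_chunk(chunk))
    return combined
```

Chunking on its own does not reduce the total amount of work. Its main benefit is that each piece can be run and saved separately, so a long job can be paused, resumed, or moved to another machine between pieces.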

nniiicc commented 1 year ago

@peiwenf

Combining axe data with the original data for Governor took more than 4 hours, and this is not the biggest data frame.

How are you combining the data? Have you tried using Google Colab, where you get an extra GPU?

peiwenf commented 1 year ago

@nniiicc I combined the data by running this script locally: https://github.com/peiwenf/campaign-access-eval/blob/2022dev/access_eval/bin/generate_access_eval_2022_dataset.py. I will look into Google Colab!

peiwenf commented 1 year ago

2/12 Issue #4 (5h)

Things I need help with: @nniiicc When I tried to push the House reports to GitHub, the system warned me that the file is larger than the 100 MB limit (it's around 286 MB), so I tried to solve it by setting up Git Large File Storage (LFS). When I pushed with LFS, I received the following message. I was unsure whether I should purchase a data plan.

(bits) CicideMacBook-Pro:campaign-access-eval fpw$ git push
batch response: This repository is over its data quota. Account responsible for LFS bandwidth should purchase more data packs to restore access.

Plan for today

nniiicc commented 1 year ago

@peiwenf - why not just split the reports up into three batches - so that they can be run with the action, and then recombine them after the action is complete?

peiwenf commented 1 year ago

I have already run the action locally; I was just unsure whether we want a copy of this data on GitHub.

nniiicc commented 1 year ago

Got it - we should duplicate the storage somewhere. For now you can just split it into 99 MB chunks (for example, house-1 and house-2).
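One way to do the split suggested above, sketched with the Python standard library only. The file name `house.csv` in the usage note and the 99,000,000-byte limit are assumptions chosen to stay under the 100 MB warning; the part names follow the house-1, house-2 pattern.

```python
from pathlib import Path
from typing import List


def split_file(path: Path, max_bytes: int) -> List[Path]:
    """Write path's bytes into numbered parts of at most max_bytes each."""
    if max_bytes < 1:
        raise ValueError("max_bytes must be at least 1")
    parts: List[Path] = []
    with path.open("rb") as src:
        index = 1
        while True:
            data = src.read(max_bytes)
            if not data:
                break
            # e.g. house.csv -> house-1.csv, house-2.csv, ...
            part = path.with_name(f"{path.stem}-{index}{path.suffix}")
            part.write_bytes(data)
            parts.append(part)
            index += 1
    return parts


def join_files(parts: List[Path], target: Path) -> None:
    """Concatenate the parts, in the given order, back into one file."""
    with target.open("wb") as dst:
        for part in parts:
            dst.write(part.read_bytes())
```

For example, `split_file(Path("house.csv"), 99_000_000)` would produce the numbered parts next to the original file. Because this cuts at byte boundaries, the individual parts are not valid files on their own and need to be rejoined with `join_files` before analysis.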

peiwenf commented 1 year ago

2/13 & 2/14 (8h) Issue #4

Tomorrow

peiwenf commented 1 year ago

2/15 & 2/16 (8h) Issue #4

Today

peiwenf commented 1 year ago

Issue #4 2/17(6h) & 2/20(6h)

Today

peiwenf commented 1 year ago

Issue #4 2/20 (6h)

Things need help with

Today

peiwenf commented 1 year ago

issue #4 2/22, 2/23, 2/24 (12h)

Tomorrow:

peiwenf commented 1 year ago

issue #4 2/28, 3/1, 3/2 (12h)

peiwenf commented 1 year ago

Issue #3 3/3 (5h)

Tomorrow:

Question @nniiicc :

peiwenf commented 1 year ago

Issue #3 (4.5h)

Tomorrow:

peiwenf commented 1 year ago

Issue #3 (16h) 3.7, 3.8, 3.9, 3.10

Today

peiwenf commented 1 year ago

Issue #3 (4.5h)

Tomorrow:

peiwenf commented 1 year ago

Issue #7 (15h) 3.14, 3.15, 3.16

Tomorrow:

peiwenf commented 1 year ago

Issue #7 3.20, 3.21 (12h)

Tomorrow:

peiwenf commented 1 year ago

3.22, 3.23, 3.24 (17h)

Plan for the week of 4.3 (Not in town for the next week)

peiwenf commented 1 year ago

4.3 (2h)

peiwenf commented 1 year ago

4.5, 4.6, 4.7 (14h)

peiwenf commented 1 year ago

4.10, 4.11, 4.12 (12h)

peiwenf commented 1 year ago

4.13 - 4.19 (20h)

This week:

peiwenf commented 1 year ago

4.20 - 4.27 Finished the analysis on the dataframe:

Today:
- Find the total number of Google trackers
- Update the blacklight repo
- Update the plots in Dropbox
- Do a literature review of the common errors

peiwenf commented 1 year ago

4.28 - 5.2 (12h) Finished all the analysis parts and organized all the scripts

peiwenf commented 1 year ago

5.3 - 5.9 (16h)

peiwenf commented 1 year ago

5.10-5.12 (16h)

peiwenf commented 1 year ago

5.15-5.18 (12h)