Open kbuzard opened 2 years ago
Hours Worked Today: 1.6 Total Hours Worked (Fall 2022 not logged): 4.5 Total Hours Work (Fall 2022): 14 Hours Worked this week: 3.4 Tasks that I am assigned: Finish digitizing 1989 Reflection: meeting with Professor Buzard. Discovered that 1s can be read as a lowercase L.
Hours Worked Today: 3.6 Total Hours Worked (Fall 2022 not logged): 4.5 Total Hours Work (Fall 2022): 17.6 Hours Worked this week: 3.6 Tasks that I am assigned: Finish digitizing 1989 Reflection: I got the list down to 1477 observations.
Hours Worked Today: 1.1 Total Hours Worked (Fall 2022 not logged): 4.5 Total Hours Work (Fall 2022): 18.7 Hours Worked this week: 4.7 Tasks that I am assigned: Finish digitizing 1989 Reflection: down to 1356
Hours Worked Today: .7 Total Hours Worked (Fall 2022 not logged): 4.5 Total Hours Work (Fall 2022): 19.4 Hours Worked this week: 5.4 Tasks that I am assigned: Finish digitizing 1989 Reflection: down to 1287
Hours Worked Today: 2 Total Hours Worked (Fall 2022 not logged): 4.5 Total Hours Work (Fall 2022): 21.4 Hours Worked this week: 7.4 Tasks that I am assigned: Finish digitizing 1989 Reflection: down to 1124 and met with Prof. Buzard
Hours Worked Today: 1.3 Total Hours Worked (Fall 2022 not logged): 4.5 Total Hours Work (Fall 2022): 22.7 Hours Worked this week: 1.3 Tasks that I am assigned: Finish digitizing 1989 Reflection: 1078
Hours Worked Today: 1.3 Total Hours Worked (Fall 2022 not logged): 4.5 Total Hours Work (Fall 2022): 24 Hours Worked this week: 2.6 Tasks that I am assigned: Finish digitizing 1989 Reflection: 999
Hours Worked Today: 1.2 Total Hours Worked (Fall 2022 not logged): 4.5 Total Hours Work (Fall 2022): 25.2 Hours Worked this week: 3.8 Tasks that I am assigned: Finish digitizing 1989 Reflection: 867 left
Hours Worked Today: 1.5 Total Hours Worked (Fall 2022 not logged): 4.5 Total Hours Work (Fall 2022): 26.7 Hours Worked this week: 5.3 Tasks that I am assigned: Finish digitizing 1989 Reflection: 755 left
Found a way to search for the .1 and .4 values that were read in as A: Ctrl+F for "A[tab]-". You cannot type a tab into the search bar, but if you copy and paste one in, all cases will show up.
If you find other big groups of the same mistake, see if you can write python code to fix it (it could be added to the script toward the beginning). This reminds me though...was there a separate script of Antonio's that had these kinds of fixes in it? I thought I remembered something like that...it doesn't seem to me like Dylan and Kelly had to spend as much time as you're spending, and I think it may be because Antonio wrote a script that fixed a lot of these issues programmatically.
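As a rough sketch of what a scripted version of the "A[tab]-" search could look like: the snippet below only reports the affected lines, since whether each A should become .1 or .4 still has to be checked by hand. The function name and the sample lines are made up for illustration and are not from Antonio's script.

```python
import re

# "A", then a literal tab, then "-": the pattern that Ctrl+F was used to find.
TAB_A = re.compile(r"A\t-")

def find_tab_a_lines(lines):
    """Return (line_number, line) pairs for every line containing 'A<tab>-'."""
    return [(i, line) for i, line in enumerate(lines, start=1) if TAB_A.search(line)]

# Made-up sample lines, not real 1989 data.
sample = ["A\t- SOME LAB", ".12\t- OTHER LAB", "BA\t-X"]
hits = find_tab_a_lines(sample)
for number, line in hits:
    print(number, repr(line))
```

Once a group of mistakes turns out to have a single correct replacement, the same pattern could be passed to `re.sub` near the beginning of the script.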
Hours Worked Today: 4.1 (THIS NEEDS TO BE CHANGED IN TIME CLOCK) Total Hours Worked (Fall 2022 not logged): 4.5 Total Hours Work (Fall 2022): 29.3 Hours Worked this week: 7.8 Tasks that I am assigned: Finish digitizing 1989 Reflection: Down to 449. A-M is done except for the addresses I do not know how to solve. I am hoping that these will go away when I fix the rest.
Dylan and Kelly edited the addresses once they were Geocoded. I think Antonio did this before he gave it to them.
Timeclock is going under maintenance right now and I am not sure when the site will be up and running again, so I cannot clock out. I started at 7:42 and ended at 11:52 pm.
Hours Worked Today: 2.7 Total Hours Worked (Fall 2022 not logged): 4.5 Total Hours Work (Fall 2022): 32 Hours Worked this week: 10.5 Tasks that I am assigned: Finish digitizing 1989 Reflection: Down to 177
regex = r"\.[1-9]{1}[0-9]{0,1} .{25,150}?\. Tel:|[A-Z]{1}[1-9]{1}[0-9]{0,1}[0-9]{0,1} .{25,150}?\. Tel:"
to include shorter addresses that were not getting read in. The {m,n}? quantifier causes the regex to match from m to n repetitions of the preceding token (here ".", any character, after the .## or A### prefix), attempting to match as few repetitions as possible.
Hours Worked Today: 3.2 Total Hours Work (Fall 2022): 35.2 Hours Worked this week: 3.2 Tasks that I am assigned: Finish digitizing 1989 Reflection: Down to 5. These five I have done the entry, print method and they do not exist. They are not being read in at all and I am not sure how to make them get read in. I did all of my previous tricks and retyped the letters.
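A small check of the lazy {m,n}? behaviour, using the regex above on two made-up entries (the lab names, addresses and phone numbers are invented for illustration). With the lazy quantifier each entry is matched separately; with the greedy form the match runs on to the last ". Tel:" within 150 characters.

```python
import re

LAZY = (r"\.[1-9]{1}[0-9]{0,1} .{25,150}?\. Tel:"
        r"|[A-Z]{1}[1-9]{1}[0-9]{0,1}[0-9]{0,1} .{25,150}?\. Tel:")
# Same pattern with the trailing "?" removed, so the quantifier becomes greedy.
GREEDY = LAZY.replace("{25,150}?", "{25,150}")

# Two invented entries on one line, as they might appear in the raw text.
text = (".12 ACME LABS INC, 12 Main St, Boston, MA 02110. Tel: 555-1234 "
        "B7 BETA RESEARCH CORP, 99 Oak Ave, Denver, CO 80202. Tel: 555-9876")

lazy_matches = re.findall(LAZY, text)
greedy_matches = re.findall(GREEDY, text)

print(len(lazy_matches), "lazy matches")
print(len(greedy_matches), "greedy matches")
```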
Okay, let's look at those five when we meet tomorrow.
Hours Worked Today: 1 Total Hours Work (Fall 2022): 36.2 Hours Worked this week: 4.2 Tasks that I am assigned: Finish digitizing 1989 Reflection: Meeting with Prof. Buzard, FINISHED the FixThese and FixThese2.
Hours Worked Today: 1 Total Hours Work (Fall 2022): 37.2 Hours Worked this week: 1 Tasks that I am assigned: Finish digitizing 1989 Reflection: meeting with Prof. Buzard
Hours Worked Today: 1.9 Total Hours Work (Fall 2022): 39.1 Hours Worked this week: 1.9 Tasks that I am assigned: Finish digitizing 1989 Reflection: Here is my progress with stripping the names out
df[column].apply(f) takes a function f as its argument and applies it to every value in column, returning a new column with the modified values. lambda x: x[#:] defines a function that takes a value x and returns the slice x[#:]; i.e., when x is a string, it returns x without the first # characters. Hence, df['column'].apply(lambda x: x[#:]) returns the column with the first # characters removed from every string in it.
Trials:
a = Matched_ID89['data_string'].apply(lambda x: x[Fac_Length:])  # Fac_Length needs to be an integer; is there a way to assign this # to the numbers in Fac_Length?
b = Matched_ID89['data_string'].str.slice(start=Fac_Length)  # returns a list of numbers :(
c = Matched_ID89['data_string'].lstrip([Fac_Length])  # lstrip doesn't like data frames and lists, only strings...
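One possible way around the "needs to be an integer" problem in the trials above is to walk the strings and their lengths together, so each string is sliced by its own number. This is only a sketch: it assumes Fac_Length holds one integer per row, in the same order as data_string, and the helper name and sample values are made up.

```python
def strip_prefixes(strings, lengths):
    """Drop the first n characters of each string, where n can differ per row."""
    return [s[n:] for s, n in zip(strings, lengths)]

# Invented stand-ins for Matched_ID89['data_string'] and Fac_Length.
data_string = ["ACME LABS, 12 Main St", "BETA CORP, 99 Oak Ave"]
fac_length = [len("ACME LABS, "), len("BETA CORP, ")]

addresses = strip_prefixes(data_string, fac_length)
print(addresses)
```

Because zip accepts any two equal-length sequences, the same call should also work with the pandas column and the list of lengths passed in directly.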
Hours Worked Today: 1 Total Hours Work (Fall 2022): 40.1 Hours Worked this week: 2.9 Tasks that I am assigned: Finish digitizing 1989 Reflection: Meeting
Hours Worked Today: 1.5 Total Hours Work (Fall 2022): 41.6 Hours Worked this week: 1.5 Tasks that I am assigned: Finish digitizing 1989 Reflection: I GOT RID OF THE NAMES and INCs. Some of the addresses are a little messed up but most of them are perfect! here is my code and I saved the most recent pdf.
import re

no_name = []
for x in Matched_ID89['data_string']:
    regex = r"^.*?,"
    y = re.sub(regex, "", x)  # remove everything up to and including the first comma (non-greedy match)
    no_name += [y]

no_nameClean = []
for x in no_name:
    x = re.sub(r'.INC,', '', x)  # drop "INC," plus the one character before it
    no_nameClean += [x]

no_nameCleane = []
for x in no_nameClean:
    x = re.sub(r'.Inc,', '', x)  # same for "Inc,"
    no_nameCleane += [x]

no_nameCleaned = []
for x in no_nameCleane:
    x = re.sub(r'.Inc', '', x)  # and for "Inc" without a trailing comma
    no_nameCleaned += [x]
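The four loops above could probably be folded into one helper. The sketch below is an assumption about how that might look, not a drop-in replacement: it does the INC/Inc removal in a single pass instead of three, and it adds a strip() of leading and trailing spaces that the original code does not do. The sample strings are invented.

```python
import re

def strip_name(data_string):
    """Remove the facility name (up to the first comma) and any INC/Inc token."""
    no_name = re.sub(r"^.*?,", "", data_string)          # non-greedy: stops at first comma
    no_inc = re.sub(r".(?:INC,|Inc,?)", "", no_name)     # "INC,", "Inc," or "Inc" plus the preceding character
    return no_inc.strip()

# Invented examples, not rows from Matched_ID89.
samples = [
    "ACME LABS, INC, 12 Main St, Boston, MA 02110",
    "Beta Research, Inc, 99 Oak Ave, Denver, CO 80202",
    "Gamma Labs, 5 Elm St, Austin, TX 78701",
]
cleaned = [strip_name(s) for s in samples]
print(cleaned)
```

Like the original patterns, this would also remove "Inc" from inside a longer word such as a street name starting with "Inc", so the output is still worth spot checking.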
ALSO: Is there a way to move our meeting to Wednesday or Thursday this week? I will be on the road all day Friday.
@Kirs10-Riley: Great progress!!!
I can meet anytime tomorrow before 5pm. If sometime mid-afternoon would work, you can just pick a time and let me know.
would 4pm work?
Yes, 4pm is fine
Thursday, Dec 22 Hours Worked Today: 1 Total Hours Work (Fall 2022): 41.6 Hours Worked this week: 2.5 Tasks that I am assigned: Finish digitizing 1989 Reflection: Meeting
Hours Worked Today: 3.4 Total Hours Work (Fall 2022): 45.0 Hours Worked this week: 3.4 Tasks that I am assigned: Finish digitizing 1989 Reflection: I created the data frame with the cleaned addresses and started working on the geocoding. I had to reconfigure the environment to run Geopandas. I left the last package running and started looking into the code.
Hours Worked Today: 5.2 Total Hours Work (Fall 2022): 50.2 Hours Worked this week: 8.6 Tasks that I am assigned: Finish digitizing 1989 Reflection:
Here is the current CSV for the geo-coded addresses Geocoded-Ad89 - Sheet1.csv
Hours Worked Today: 4.6 Total Hours Work (Fall 2022): 54.8 Hours Worked this week: 4.6 Tasks that I am assigned: Finish digitizing 1989 Reflection:
@Kirs10-Riley I see what you mean about the geocoded points being in weird places. But why would anything have changed at all? If the data_string column didn't change, I don't see why anything would change with the geocoding. Are you able to specify which column it uses to perform the geocoding? If not, it's possible that the new columns confuse it somehow.
I spot checked a few and agree that map is not showing things where they're supposed to be. When I filtered on NY for the state, NONE of the pins were in New York. So I think the problem is that somehow the map got disconnected from the table.
At the same time, I ran across something that I don't know how to interpret. At some point, the Address column stops matching up with the data_string column. Here's an example:
| YOUSSEF & ASSOCIATES CONSULTING ENGINEERS, 1001 Spring St, Silver Spring, MD 20910 | 418 First Ave S, Seattle, WA 98104 | 47.59885 | -122.334 | ['g, MD'] | ['20910'] |
|---|---|---|---|---|---|
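One rough way to find where the Address column stops matching data_string is to compare ZIP codes between the two fields. This is a sketch under the assumption that both fields normally contain the same 5-digit ZIP; the function name is made up, and a 5-digit street number could cause a false match.

```python
import re

ZIP = re.compile(r"\b\d{5}\b")

def address_matches(data_string, address):
    """True when the two fields share at least one 5-digit number (assumed to be the ZIP)."""
    return bool(set(ZIP.findall(data_string)) & set(ZIP.findall(address)))

# The mismatched row from the example above.
row_data = ("YOUSSEF & ASSOCIATES CONSULTING ENGINEERS, "
            "1001 Spring St, Silver Spring, MD 20910")
row_address = "418 First Ave S, Seattle, WA 98104"

mismatch = not address_matches(row_data, row_address)
print("mismatch:", mismatch)
```

Running a check like this down the sheet and printing the first row where it fails would show the point at which the two columns drift apart.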
If you can't get the map working, I think you can pull the file into ArcGIS to do the state-by-state checking.
Hours Worked Today: 2.7 Total Hours Work (Fall 2022): 57.5 Hours Worked this week: 7.3 Tasks that I am assigned: Finish digitizing 1989 Reflection:
The updated sheet looks much better.
Assuming we're meeting as usual tomorrow, let's brainstorm about the map then.
Hours Worked Today: 2.1 Total Hours Work (Fall 2022): 59.6 Hours Worked this week: 9.4 Tasks that I am assigned: Finish digitizing 1989 Reflection: meeting plus fixes to the addresses.
Hours Worked Today: 1.1 Total Hours Work (Fall 2022): 60.7 Hours Worked this week: 1.1 Tasks that I am assigned: Finish digitizing 1989 Reflection: 1.2 meeting and some geo-coding
Hours Worked Today: 2.4 Total Hours Work (Fall 2022): 63.1 Hours Worked this week: 3.5 Tasks that I am assigned: Finish digitizing 1989 Reflection: fixed some of the addresses. having some problems geocoding them. I left it running.
Hours Worked Today: .8 Total Hours Work (Fall 2022): 63.9 Hours Worked this week: 4.3 Tasks that I am assigned: Finish digitizing 1989 Reflection: I re-ran it again and some other international ones popped up. I fixed those and changed all "OB" state abbreviations to "OK". I started fixing a couple of typos that I noticed.
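The "OB" to "OK" change is the kind of fix that could be scripted rather than done by hand. A minimal sketch, assuming the bad abbreviation always sits between a comma and a 5-digit ZIP; the function name and the sample addresses are invented.

```python
import re

# ", OB 74103" style state field: comma, optional spaces, OB, then a 5-digit ZIP.
STATE_OB = re.compile(r",\s*OB(\s+\d{5})")

def fix_ob_state(address):
    """Replace a misread 'OB' state abbreviation with 'OK', keeping the ZIP."""
    return STATE_OB.sub(r", OK\1", address)

fixed = fix_ob_state("12 Main St, Tulsa, OB 74103")
untouched = fix_ob_state("OBERLIN AVE, Lorain, OH 44052")
print(fixed)
print(untouched)
```

Anchoring on the comma and the ZIP is meant to avoid touching "OB" where it appears inside a street or city name.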
Hours Worked Today: 2.1 Total Hours Work (Fall 2022): 66 Hours Worked this week: 2.1 Tasks that I am assigned: Finish digitizing 1989 Reflection: Meeting + started checking all of the addresses by state. I estimate that there will be about 20 addresses that are in the wrong place.
Hours Worked Today: 2.3 Total Hours Work (Fall 2022): 68.3 Hours Worked this week: 4.4 Tasks that I am assigned: Finish digitizing 1989 Reflection: Finished going through the addresses. There are 18 "wrong" addresses, I corrected about 25 of the addresses on top of that and re-ran the program.
Hours Worked Today: 2.3 Total Hours Work (Fall 2022): 70.6 Hours Worked this week: 2.3 Tasks that I am assigned: Finish digitizing 1989 Reflection: fixed the addresses, updated the documentation on geocoder89.py, and had a meeting with Prof. Buzard.
Hours Worked Today: 2.1 (forgot to clock out and emailed Ashley) Total Hours Work (Fall 2022): 72.7 Hours Worked this week: 4.4 Tasks that I am assigned: Finish digitizing 1989 Reflection: imported the csv back into the geocoder89 script and made it to row 130 of geocoder89.py.
I ran into a little problem. Geocoder.py needs the Labs1998 equivalent for 1989 but I cannot find how this was made.
Hours Worked Today: 2.9 Total Hours Work (Fall 2022): 75.6 Hours Worked this week: 7.3 Tasks that I am assigned: Finish digitizing 1989 Reflection: Made a map using folium; I added an image of the map zoomed out below. Each point is attached to the facility name. The Python script is under 1989 python scripts as map_attempt1.py, and the map is saved as map1 under 1989. This map does take a while to zoom out, which could be because I did it on rds. It's a little messy looking, and I am trying to make a map using geopandas, but loading in the USA block took too much time. I left it running, but I am not sure whether the program stops running once rds closes.
@Kirs10-Riley Definitely don't use the block shape file then! I've tried that before and it's a mess! Try Googling to find a shape file that just has the state outlines. There should be lots of them out there...
Then see if you can change the markers so that they are small enough that they don't overlap each other.
This is a great first step!
Hours Worked Today: 1.2 Total Hours Work (Fall 2022): 76.8 Hours Worked this week: 1.2 Tasks that I am assigned: Finish digitizing 1989 Reflection: meeting
Hours Worked Today: 3 Total Hours Work (Fall 2022): 78.6 Hours Worked this week: 4.2 Tasks that I am assigned: Finish digitizing 1989 Reflection: Did two more map attempts, ending with map4, found in the 1989 folder. I took a screenshot of the map below. Let me know what your thoughts on it are. For this map, I fixed the starting zoom and changed the labels to dots; if you click on a dot, it will tell you the lab name. I cannot figure out a way to get Alaska and Hawaii to be closer to the rest of the US, so unless I remove those states and create two separate mini maps, the zoom is going to have to be pretty far out to get all of the US.
This map is GREAT! Exactly what we need!
@Kirs10-Riley : please log your work daily here using the following format:
Tuesday, May 31