Open linicole opened 4 years ago
The public health department has access to Emergency Department data through Syndromic Surveillance (https://www.cdc.gov/nssp/overview.html). I wonder whether IDPH, like IDOT, provides some of the hospital data publicly. I'm aware of some research that merged crash data with hospital billing data.
In addition, the "HospitalName" field in the Person dataset contains lots of code-type values instead of hospital names, and the data dictionary doc doesn't explain them. Thanks!
I was going to check with the people at the State, but it's clearly just messy data. I can't imagine a data dictionary helping here.
There are 3,322 distinct hospital name entries. Here are the top 20:
> people[ , .N, HospitalName][order(-N)][1:20]
HospitalName N
1: 1906337
2: REFUSED 26630
3: DNA 11967
4: NONE 6192
5: 99 3421
6: MERCY 1936
7: DECLINED 1645
8: ST. BERNARD 1507
9: MT SINAI 1320
10: HOLY CROSS 1308
11: STROGER (COOK COUNTY) 1250
12: UNIVERSITY OF CHICAGO 1124
13: UNKNOWN 1062
14: STROGER 1002
15: NORTHWESTERN 924
16: REFUSED EMS 822
17: ROSELAND COMMUNITY 805
18: COMMUNITY FIRST 803
19: NORTHWESTERN MEMORIAL 717
20: LORETTO 704
This gives you a sense for what the head / tail of the hospital names look like:
> people[ , .N, HospitalName][order(-N)]
HospitalName N
1: 1906337
2: REFUSED 26630
3: DNA 11967
4: NONE 6192
5: 99 3421
---
3318: REFUSED- EMS ON SCENE 1
3319: REFUSED - EMS ON SCENE 1
3320: REFUSED MEDICAL ATTETION 1
3321: LUTHERN GENERAL 1
3322: NORHTWESTERN 1
I've used the qlcMatrix package in the past to examine string distances. I'm attaching a text file (which should really be a .R file) with a function I developed that checks each entry of an input list to see what it most closely matches.
mmatch.txt
I think this function could work, but we'd need a master list of hospitals and their locations.
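For anyone following along, here is a minimal sketch of the kind of matching described above. It is not the attached mmatch.txt function; it swaps in base R's `adist()` (generalized Levenshtein distance) instead of qlcMatrix, and the `master` list, the `closest_match` name, and the `max_dist` threshold are all assumptions made up for illustration.

```r
# Hypothetical master list; a real one would come from an official directory.
master <- c("NORTHWESTERN MEMORIAL", "LUTHERAN GENERAL", "MT SINAI", "HOLY CROSS")

# For each input, find the master name with the smallest edit distance.
# Inputs farther than max_dist (an assumed cutoff) from every name get NA.
closest_match <- function(x, master, max_dist = 3) {
  d <- utils::adist(toupper(trimws(x)), master)  # rows: inputs, cols: master
  best <- apply(d, 1, which.min)                 # column index of nearest name
  dist <- d[cbind(seq_along(x), best)]           # distance to that name
  data.frame(
    input = x,
    match = ifelse(dist <= max_dist, master[best], NA_character_),
    distance = dist,
    stringsAsFactors = FALSE
  )
}

res <- closest_match(c("LUTHERN GENERAL", "NORHTWESTERN", "REFUSED"), master)
print(res)
```

Note that a short form such as NORHTWESTERN sits far from a longer official name like NORTHWESTERN MEMORIAL in raw edit distance, so an absolute cutoff would need tuning, or the distance could be normalized by string length.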
I think it would be nicer to do some unsupervised learning on the data and look for natural clusters by name, and then see if the similar name clusters are also geographically concentrated. You'd think that certain areas would depend on certain hospitals. However, they might also use more distant hospitals that have (for example) superior trauma units or specialties.
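A rough sketch of the name-clustering half of that idea, using only base R. The choice of hierarchical clustering with average linkage, the length normalization, and the cut height of 0.6 are assumptions for illustration, not something tested on the full data. The sample names are taken from the counts above. The geographic check is not covered here.

```r
# A handful of names taken from the HospitalName counts.
names_vec <- c("STROGER", "STROGER (COOK COUNTY)", "NORTHWESTERN",
               "NORHTWESTERN", "NORTHWESTERN MEMORIAL", "REFUSED", "REFUSED EMS")

# Pairwise Levenshtein distances, scaled by the longer string's length
# so that long names are not penalized just for being long.
d <- utils::adist(names_vec)
len <- nchar(names_vec)
d_norm <- d / outer(len, len, pmax)

# Average-linkage hierarchical clustering, cut at an assumed height of 0.6.
hc <- stats::hclust(stats::as.dist(d_norm), method = "average")
clusters <- stats::cutree(hc, h = 0.6)
names(clusters) <- names_vec

print(split(names_vec, clusters))
```

On 3,322 distinct names the full distance matrix is still small enough to compute in memory, but the cut height would need to be checked by eye against the resulting groups.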
Hi, I found a list of trauma centers here:
hospital_list.zip - I created a csv file from the link above.
Map of Trauma Centers - Probably need to find a way to download for Illinois.
I have a fuzzy matching script in R that can match against a full list of hospitals and trauma centers using the Levenshtein and Jaro distances (stringi and stringdist packages). We also need to either create another column that identifies whether an entry is a hospital, or fully clean the HospitalName column. I will work on it this week.
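As a starting point for the extra column, here is a small sketch that flags entries that look like non-hospital responses. The pattern list is an assumption based only on the top entries shown earlier (REFUSED, DECLINED, DNA, NONE, UNKNOWN, numeric codes, blanks), and the `is_hospital` column name is hypothetical. The real list would need review against the full set of values.

```r
# Sample values taken from the HospitalName counts above.
hospital_name <- c("", "REFUSED", "DNA", "NONE", "99", "MERCY", "DECLINED",
                   "UNKNOWN", "REFUSED EMS", "REFUSED MEDICAL ATTETION", "MT SINAI")

# Assumed markers of a non-hospital response: a known keyword or a purely
# numeric code at the start of the entry, or an empty string.
non_hospital_pattern <- "^(REFUSED|DECLINED|DNA|NONE|UNKNOWN|[0-9]+)\\b|^$"

cleaned <- trimws(toupper(hospital_name))
is_hospital <- !grepl(non_hospital_pattern, cleaned, perl = TRUE)

print(data.frame(hospital_name, is_hospital))
```

Entries flagged TRUE are only candidates; they would still go through the fuzzy match against the hospital list.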
Thanks @hneaz. I've also reached out to CDPH. Let's see what they provide. I can ask if there is a downloadable map.
CDPH shared with us this website that has a list of all the hospitals along with their addresses: https://data.illinois.gov/dataset/410idph_hospital_directory/resource/9bdedb85-77f3-490a-9bbd-2f3f5f227981
One can export that data to CSV.
In case there isn't a way to export the map that @hneaz mentioned, we can geolocate the hospitals through their addresses. I don't know if there's a tool in R for that, but let me know if we need to do it in GIS and I can do so using an address locator (at least for Chicago).
@sas1336 Thank you for the link. This is very useful. I can start working on cleaning up the hospital names and also try out @geneorama's script.
If we need the lat/long for the hospitals, I can get them with the SmartyStreets Python API by uploading the file from the link provided. I have an unlimited subscription plan from work that I can use to get clean addresses with lat/long.
[UPDATE] I got the latitude and longitude data for the hospitals. Please find attached. illinois_hospital_list.csv.zip
The topic I would like to dig into is hospital and emergency services for traffic crashes in Chicago. Right now there is a lack of data on this subject, so I would like to request or see more information in the public datasets. Thank you!