Open wwyws0000 opened 4 years ago
Topic ideas. Purpose: find an area for food and explore (not like Yelp, where you just find a restaurant and go).
1. Most common cuisine by borough/neighborhood
2. Zipcodes with the lowest violation scores
3. Slider bar for highest average violation per zipcode/borough by cuisine type
4. Slider bar for lowest average violation per zipcode/borough by cuisine type
5. Select search by cuisine, use radio buttons to search by grade, and use a slider for what percentage of a zipcode/borough has those grades or higher; hover over the map to show the number of restaurants and pinpoint restaurant locations in those neighborhoods
Thanks Adam, not sure our purpose should be that broad, since the dataset is explicitly on food safety concerns, whereas food exploration depends on many other variables. So I feel that unless we are incorporating another dataset with ratings, etc, we should focus purely on the food safety aspect
This is good if we're sure that the dataset is intended to include all the restaurants in existence, not just those with certain violations (otherwise the information is quite biased)
Perhaps we can show this via the heatmap?
3,4,5: I find these a bit difficult to understand. Could you elaborate?
I've made a new commit where I started a sample app that illustrates my thoughts on the structure of our project/app. The UI and server code are under "Wenyue doc". The structure is as follows:
Overview tab. I've already done a partial implementation. It has two sub-tabs. On the "Top Violations" sub-tab, the user can see a table and bar chart showing the top X types of violations, filtered by Cuisine Type, Borough, and Severity. On the "Inspection Score" sub-tab, the user can see a distribution/histogram of the inspection score, filtered by the same criteria.
Map tab. Here we can implement ideas and visualizations based on maps. For now I can think of things like a heatmap of average inspection score by zipcode, etc.
Restaurant Lookup. Here we can show some more detailed information on a single (or maybe multiple?) restaurants, such as address, cuisine type, historical inspection issues, most recent inspection grade, etc.
Hi Wenyue,
For 3 and 4, I was thinking that by grouping by zipcode we can see which cuisines are known to have the most and least violations (a heatmap maybe). As for the slider bar, I don't remember what I had in mind when I pitched that idea.
5 was similar but I was thinking like a restaurant search. As an example, looking for zipcodes where 60% of Chinese restaurants have Grade B or higher.
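To make the idea in 5 concrete, here is a rough sketch of that search in base R. The toy data, the column names (CAMIS, ZIPCODE, CUISINE, GRADE), and the helper `share_at_or_above` are all hypothetical, and it assumes one row per restaurant with its current grade.

```r
# Toy data: one row per restaurant (hypothetical values and column names)
restaurants <- data.frame(
  CAMIS   = 1:7,
  ZIPCODE = c("10001", "10001", "10001", "10002", "10002", "10003", "10003"),
  CUISINE = c("Chinese", "Chinese", "Chinese", "Chinese", "Chinese", "Chinese", "Pizza"),
  GRADE   = c("A", "B", "C", "C", "C", "A", "C"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: per zip code, the share of restaurants of one cuisine
# whose grade is `min_grade` or better.
share_at_or_above <- function(df, cuisine, min_grade = "B") {
  grades <- c("A", "B", "C")                          # ordered best to worst
  ok_grades <- grades[seq_len(match(min_grade, grades))]
  sub <- df[df$CUISINE == cuisine & df$GRADE %in% grades, ]
  tapply(sub$GRADE %in% ok_grades, sub$ZIPCODE, mean) # mean of TRUE/FALSE = share
}

shares <- share_at_or_above(restaurants, "Chinese", "B")

# Zip codes where at least 60% of Chinese restaurants have grade B or higher
matching_zips <- names(shares)[shares >= 0.60]
print(matching_zips)
```

In the app, the 0.60 threshold would come from the slider and the cuisine and grade from the other inputs.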
Question: Who do you think the app is for? Based on what you submitted, is the app for inspectors? Would they use the app to make it easier to find restaurants that might have violations? Or is it for everyday people?
Hi guys, sorry, I have interviews these days. I'll start my work this weekend.
Hi Adam,
For 3 and 4 that's what I'm thinking as well. Heatmap of average scores by zipcode. Or we can do a dot overlay on the map where each dot represents a restaurant, and the color/size of the dot changes according to its score. Either way should be good.
IMO the app is for anyone who wants to know inspection results for a specific restaurant or group of restaurants. Could be an individual who just wants to see if the restaurant they're heading to is graded well, or it could be an inspector who wants to see some trends/analysis.
From a practical point of view, most of the app's users will be people who want to pick a good restaurant to dine at, so they would consider the following factors: 1. cuisine type; 2. location; 3. menu, reviews, price; 4. violations (food safety, hygiene). They would rank each factor based on personal preference.
So for the dataset Wenyue provided, my suggestions for sorting out the violations info are:
Present an overview: a. the percentage of each Grade category filtered by "cuisine type", "zip code" or both; b. Top or Grade A restaurants filtered by "cuisine type", "zip code" or both; c. the percentage of each "cuisine type" filtered by "Grade", "zip code" or both.
Provide more details on a specific restaurant: a. as Wenyue says above, mark all the restaurants on the map, with the dot size or color corresponding to the Grade. This can also be filtered by "cuisine type", "borough"/"zip code", so the map only shows dots that meet the set conditions. b. If users are interested in a particular restaurant, they can either type the name in a search bar or click it on the map, and more detailed info about that restaurant will be presented in a table, incl. street, phone, cuisine, all violation codes/descriptions, grade, etc. (I notice that each restaurant may be cited many times, so maybe when we present the detailed info of a restaurant, we can calculate the average score and then assign it a grade.)
As for the dataset Na provided, we can either merge the two datasets by a key (restaurant name maybe? but it doesn't seem that easy) or design a separate section for it.
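The "average score, then assign a grade" idea could look roughly like the sketch below. The toy data is made up and assumes one row per inspection. The cutoffs (0-13 = A, 14-27 = B, 28+ = C) are an assumption based on the DOHMH letter-grade bands and should be checked against the dataset documentation before we rely on them.

```r
# Toy data: one row per inspection (hypothetical values)
inspections <- data.frame(
  CAMIS = c(1, 1, 2, 2, 3),
  SCORE = c(10, 12, 20, 30, 40)
)

# Average score per restaurant
avg_score <- tapply(inspections$SCORE, inspections$CAMIS, mean)

# Assumed cutoffs: up to 13 = A, above 13 up to 27 = B, above 27 = C
summary_tbl <- data.frame(
  CAMIS     = names(avg_score),
  AVG_SCORE = as.numeric(avg_score),
  GRADE     = as.character(cut(as.numeric(avg_score),
                               breaks = c(-Inf, 13, 27, Inf),
                               labels = c("A", "B", "C"))),
  stringsAsFactors = FALSE
)
print(summary_tbl)
```

Note that an averaged grade is our own derived label, not an official grade, so the app should say so.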
This is my idea right now. I hope you guys can combine my suggestions into the outline and then design the overall layout based on that outline. Thanks for reading and have a great weekend!
Crystal - did Na provide a dataset somewhere? I think I missed it.
In the email chain, Na posted this dataset: https://data.cityofnewyork.us/Health/DOHMH-MenuStat/qgc5-ecnb. She also said in the email that she was working on the app, but I don't think she has committed it yet.
yup, it's about the menu and food nutrients.
Oh okay I found it now. The only problem I see is that the data only has information on nutrition facts of the menu items. I'm not sure if that's enough to make our app a "restaurant recommendation" app because
Yes, you are right, so I was wondering whether we could use a key like the restaurant's name to match some info from the first dataset, and use this dataset to create another page mainly focusing on nutrients.
In this way, we could set our app's theme as food safety and nutrition facts, which together could fall under one topic, "Healthy diet", and thus target users who care about a healthy diet and want to make sure the restaurant they choose is safe and healthy. I think that would make our app's content richer than just telling the violation story.
Hi guys, sorry for the late reply, I didn't know we were discussing on GitHub. After trying to map the restaurants and merge the two datasets by name, I found that the menu dataset is for chain stores like 7-Eleven. I apologize for the misleading suggestion and for wasting your time. After searching NYC OpenData for several restaurant-related keywords, I don't think there is other data available. I think we can focus on the data provided by Wenyue.
Not a problem Na. I couldn't find any other restaurant datasets either so I'll only focus on the violations.
I just committed my updated code where I enhanced/completed some of the features I mentioned last time. Mainly, I changed the Overview tab to Comparison, where users can compare top violations, inspection scores, and grade breakdowns using two sets of filters.
Top Violations:
Inspection Score:
Grade Breakdown:
Let me know if you guys have any suggestions!
Hey guys, I added a leaflet map to the map tab. I'm still trying to figure out how to add boundaries around certain zipcodes/neighborhoods/boroughs. I'm going to try to figure that out tonight when I have more free time. Any suggestions would still be much appreciated.
Also @wwyws0000, no suggestions at the moment, what you did today was great!
Hey guys, I got some tiles up on the map, but I don't really understand how it works. If someone finds an easy way to add tiles on top of the map, please commit it to the master branch. I'll keep researching too.
Thanks Adam. I'm a bit occupied for the next couple of days but will try to figure something out by the end of the week.
Hi guys, I just discussed with Luyue and we wrote some draft code for restaurant recommendations. I have committed it, but I think some adjustments are needed before it works. Generally, our code shows a dot for each restaurant on the map and recommends restaurants according to the user's input (zipcode, cuisine type, and so on). I am now thinking about how to use checkboxes as a filter. We will update our code after fixing this problem. By the way, is everyone okay with having a short meeting after class to discuss our next steps? If that time doesn't work, we can pick another. Thank you guys for reading.
That sounds great! I've been sick since yesterday so I won't be in class today. If you guys end up meeting, just let me know what I need to do.
Hi guys, Luyue and I just finished the restaurant recommendation code, and I just committed the new version. In our app, entering a cuisine type and grade letter displays a dot for each matching restaurant on the map, with a list of the matching restaurants above the map. I think that's what we can do so far. @wwyws0000 Hope you get well soon!
Hi, I'm still stuck trying to add colored tiles to the map. I decided to use the code that was given to us in the app folder to make a choroplethr map, but I don't know what it's outputting. Can someone help me figure this out?
I'm also stuck trying to group all the zip codes in one column and add up the values in a second column. For example, if restaurant A is in zip code 10000 with a violation score of 10 and restaurant B is in zip code 10000 with a violation score of 30, the output should be a table with a zip code column and a violation score column showing that zip code 10000 has a score of 40.
Ignore what I just sent I think I figured it out
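For reference, the grouping described above can be sketched in base R like this. The data frame and column names are made up for illustration. It is not necessarily the approach that ended up in the app.

```r
# Toy data matching the example: restaurants A and B share zip code 10000
scores <- data.frame(
  RESTAURANT = c("A", "B", "C"),
  ZIPCODE    = c("10000", "10000", "10001"),
  SCORE      = c(10, 30, 5),
  stringsAsFactors = FALSE
)

# One row per zip code, with the scores in that zip code added up
zip_totals <- aggregate(SCORE ~ ZIPCODE, data = scores, FUN = sum)
print(zip_totals)
```

Swapping `FUN = sum` for `FUN = mean` gives an average per zip code instead of a total.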
Hey guys, I finished my part and I'm going to commit it in the morning. Also, as a side note, did our professor tell us that we had to put our code on some website? I remember him telling us something like that, but I can't find where he talked about it in the project 2 notes.
Hey guys, I added Na and Luyue's code to "Wenyue doc". 3 things:
Thanks Adam. I will take a look at it tomorrow!
Adam - a couple of suggestions on your code:
Taking a simple average of scores using group_by(ZIPCODE) %>% summarise(mean(SCORE)) doesn't give an accurate average score IMO. This is because each restaurant can have more than one row in the dataset. For example, if restaurant A had 5 violations on an inspection conducted on date d1, then it would have 5 entries with the same score. If it had another inspection on another date, that would add additional entries for the same restaurant. This significantly distorts the simple average score. So we want to ask ourselves this question: what do we want our average score to really show?/what is the most meaningful and useful metric to show? My answer would be to show the average score of only the latest inspection for each restaurant (CAMIS).
The map seems a bit too small on the interface. I feel it should take up the majority of the screen. Perhaps the easiest way to do this is to separate the map and the table into 2 sub-tabs.
I'm also not entirely sure that the violation score is just a count of how many violations a restaurant committed, so the label needs to be changed. Also, there's no need to show 20 decimal places in the table.
Is there a way to increase the transparency of the colored tiles or increase the visibility of the area labels under the tiles? When I look at the colored map it's very difficult to see the area names under the color.
None of the .nbhd variables you initialized seem to be used in the code. If that's the case we should get rid of them.
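To illustrate the first point, here is a rough base R sketch comparing the naive average with the "latest inspection per CAMIS" average. The toy rows are made up, and the date format ("%m/%d/%Y") is an assumption about how INSPECTION DATE is stored in the raw file.

```r
# Toy data: restaurant 1 has 3 violation rows from one old inspection
# plus 1 row from a newer inspection; restaurant 2 has a single row.
raw <- data.frame(
  CAMIS   = c(1, 1, 1, 1, 2),
  ZIPCODE = c("10000", "10000", "10000", "10000", "10000"),
  `INSPECTION DATE` = c("01/15/2018", "01/15/2018", "01/15/2018",
                        "03/02/2019", "06/10/2019"),
  SCORE   = c(30, 30, 30, 10, 20),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Assumed date format: month/day/year
raw$insp_date <- as.Date(raw[["INSPECTION DATE"]], format = "%m/%d/%Y")

# Naive average: every row counts, so the old 3-violation inspection dominates
naive <- aggregate(SCORE ~ ZIPCODE, data = raw, FUN = mean)

# Keep only rows from each restaurant's latest inspection,
# then keep a single row per restaurant (CAMIS)
latest_date <- ave(as.numeric(raw$insp_date), raw$CAMIS, FUN = max)
latest <- raw[as.numeric(raw$insp_date) == latest_date, ]
latest <- latest[!duplicated(latest$CAMIS), ]

by_zip <- aggregate(SCORE ~ ZIPCODE, data = latest, FUN = mean)
print(naive)   # average over all rows
print(by_zip)  # average over latest inspections only
```

The same steps map onto group_by(CAMIS) followed by a filter on the max inspection date, if we stay with dplyr.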
Na & Luyue:
I also feel it might be best to increase the area of the map.
The table doesn't need to contain all the original information in the raw file. Some simple combinations of Name, address, cuisine, score, grade, violation, etc should be enough for the user and not appear messy.
For the map, you should first filter so that only one instance of each restaurant (CAMIS) exists in your data frame. Like I said, the original data contains many rows for each restaurant. Even though they all share the same long/lat, the duplicates may add clutter to the map and reduce performance. Also, you may want to clean the longitude and latitude data (getting rid of invalid values).
Piggybacking on what Adam said earlier, your data import, i.e.
df <- read.csv("C:/Users/ajkra/OneDrive/Documents/GitHub/fall2019-proj2--sec1-grp6/data/DOHMH_New_York_City_Restaurant_Inspection_Results.csv")
refers to your own local folder, so the code doesn't work for anyone else. But since you're simply reading the original raw file, you can just set your df to the variable "data_raw", which I load in my code and which already takes care of the path issues:
data_raw_1 <- fread('../data/Raw Data 1.csv')
data_raw_2 <- fread('../data/Raw Data 2.csv')
data_raw <- rbind(data_raw_1, data_raw_2)
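A minimal sketch of the map clean-up suggested above, in base R. The toy rows are made up, and the bounding box used to decide what counts as a "valid" coordinate is my own rough assumption for NYC, not something from the dataset documentation.

```r
# Toy data: restaurant 1 appears twice; 2 has 0/0 coordinates; 3 has NAs
raw <- data.frame(
  CAMIS     = c(1, 1, 2, 3, 4),
  DBA       = c("CAFE ONE", "CAFE ONE", "PIZZA TWO", "NOODLE THREE", "DELI FOUR"),
  Latitude  = c(40.75, 40.75, 0, NA, 40.68),
  Longitude = c(-73.99, -73.99, 0, NA, -73.95),
  stringsAsFactors = FALSE
)

# One row per restaurant so each place is drawn only once
one_per_restaurant <- raw[!duplicated(raw$CAMIS), ]

# Drop missing coordinates and anything outside a rough NYC bounding box
# (box limits are an assumption for illustration)
valid <- with(one_per_restaurant,
              !is.na(Latitude) & !is.na(Longitude) &
                Latitude > 40.4 & Latitude < 41.0 &
                Longitude > -74.3 & Longitude < -73.6)
map_points <- one_per_restaurant[valid, ]
print(map_points)
```

With the real data, `raw` would be replaced by `data_raw`, and `map_points` is what would be passed to the map.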
@wwyws0000 I followed your suggestions and fixed the code. As I said before, I'm not in town and I had to fix the code on a borrowed computer. Sorry, but I don't know if I can do much more since I have limited computer access. Hopefully I fixed all the problems. See you guys this weekend. Also, I think we need to add the code to some website. Can someone do that?
Hi guys, I am trying to publish our app, but it shows "Disconnected from the server" on shinyapps.io. Do you know how to solve the problem?
I am trying to publish our code to shinyapps.io and write the summary on GitHub. I am also thinking about how to put together a presentation this weekend. Could we meet on Mon or Tue to discuss the final output?
Adam -
The filter you applied to get unique restaurants uses "RECORD DATE", which has only one value: the date the data was pulled. So it does nothing (the filtered data has exactly the same number of rows as the unfiltered data). The correct date field is "INSPECTION DATE".
I'll fix this myself.
Thanks, sorry about that. When looking for the inspection date I only saw inspection type, so I got confused and thought the record date had to be the right field to filter by.
No problem. Also, can you point me towards where you got the neighborhood data from (the .nbhd variables that were not in use) ? I think we might be able to find use for it after all.
I have published part of the app, and the link is https://lovelydoggy.shinyapps.io/Proj2_pre/. The issue is that when I click "Inspection Grade", it sometimes disconnects from the server. I'll try to find out how to fix it.
@wwyws0000 I got the data from https://www.health.ny.gov/statistics/cancer/registry/appendix/neighborhoods.htm and I just hard-coded each neighborhood name, making sure it was in the same relative position as its corresponding zip code.
Hi all,
I made the final edits, including adding additional feature improvements and updating the summary/contribution statements.
@lovelydogggy Please submit this to the shiny server. You may need to install another package (shinyWidgets) for the code to work.
Is everyone good with meeting 30 min before class (around 5:40 pm) to discuss and go over the presentation?
Wenyue
@wwyws0000 Thank you for your final edits! I am okay to meet 30 min before class. See you guys then.
Sounds good to me.
The website is https://lovelydoggy.shinyapps.io/Final_presentation/.
Hi all, just posting a couple datasets that I thought may be interesting to analyze for our project:
DOHMH New York City Restaurant Inspection Results (https://data.cityofnewyork.us/Health/DOHMH-New-York-City-Restaurant-Inspection-Results/43nn-pn8j) A large dataset. Analysis can be done from many angles such as type of violations, type of restaurants/cuisines, location of restaurants.
2013 - 2015 New York State Mathematics Exam by School (https://data.cityofnewyork.us/Education/2013-2015-New-York-State-Mathematics-Exam-by-Schoo/gcvr-n8qw) Can be analyzed based on school, grade, race, sex, etc.
Feel free to add any other topics you found interesting.