TZstatsADS / fall2019-proj2--sec1-grp6

fall2019-proj2--sec1-grp6 created by GitHub Classroom

Project Data Discussion #1

Open wwyws0000 opened 4 years ago

wwyws0000 commented 4 years ago

Hi all, just posting a couple datasets that I thought may be interesting to analyze for our project:

  1. DOHMH New York City Restaurant Inspection Results (https://data.cityofnewyork.us/Health/DOHMH-New-York-City-Restaurant-Inspection-Results/43nn-pn8j) A large dataset. Analysis can be done from many angles such as type of violations, type of restaurants/cuisines, location of restaurants.

  2. 2013 - 2015 New York State Mathematics Exam by School (https://data.cityofnewyork.us/Education/2013-2015-New-York-State-Mathematics-Exam-by-Schoo/gcvr-n8qw) Can be analyzed based on school, grade, race, sex, etc.

Feel free to add any other topics you found interesting.

akrav commented 4 years ago

Topic ideas: Purpose : (Find an area for food and explore (not like yelp where you just find a restaurant and go))

  1. Borough/ Neighborhood most common cuisine

  2. Zipcodes with the lowest violation scores

  3. slider bar for highest average violation per zipcode/Borough by cuisine type

  4. slider bar for lowest average violation per zipcode/Borough by cuisine type

  5. Select search by cuisine and use radio buttons to search by grade and slider for what percentage of a zipcode/borough has those grades or higher, hover over map to show number of restaurants and can pinpoint restaurant locations in those neighborhoods

wwyws0000 commented 4 years ago

Thanks Adam, not sure our purpose should be that broad, since the dataset is explicitly on food safety concerns, whereas food exploration depends on many other variables. So I feel that unless we are incorporating another dataset with ratings, etc, we should focus purely on the food safety aspect

  1. This is good if we're sure that the dataset is intended to include all the restaurants in existence, not just those with certain violations (otherwise the information is quite biased)

  2. Perhaps we can show this via the heatmap?

    3,4,5: I find these a bit difficult to understand. Could you elaborate?

I've made a new commit where I started a sample app that illustrates my thoughts on the structure of our project/app. The ui and server code is under "Wenyue doc". The structure is as follows:

  1. Overview tab. I've already done a partial implementation. It has two sub-tabs. On the "Top Violations" subtab, the user can see a table and bar chart showing the top X types of violations, filtered by Cuisine Type, Borough, and Severity. On the "Inspection Score" subtab, the user can see a distribution/histogram of the inspection score, filtered by the same criteria.

  2. Map Tab. Here we can implement ideas and visualizations based on maps. For now I can think of things like a heatmap of average inspection score by zipcode, etc.

  3. Restaurant Lookup. Here we can show some more detailed information on a single (or maybe multiple?) restaurants, such as address, cuisine type, historical inspection issues, most recent inspection grade, etc.

akrav commented 4 years ago

Hi Wenyue,

For 3 and 4, I was thinking that by grouping by zipcode we can see which cuisines are known to have the most and least violations (a heatmap maybe). As for the slider bar, I don't remember what I was thinking when I pitched that idea.

5 was similar, but I was thinking of something like a restaurant search. As an example, looking for zipcodes where 60% of Chinese restaurants have Grade B or higher.
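A minimal sketch of that kind of search, assuming each restaurant has already been reduced to one row of zip code, cuisine, and letter grade. The sample rows, the `GRADE_RANK` ordering, and the function name `zips_meeting_threshold` are hypothetical and only there for illustration; the app itself is a Shiny app, so this is the logic rather than the actual code.

```python
from collections import defaultdict

# Hypothetical sample rows: (zip code, cuisine, letter grade).
restaurants = [
    ("10002", "Chinese", "A"),
    ("10002", "Chinese", "B"),
    ("10002", "Chinese", "C"),
    ("10013", "Chinese", "A"),
    ("10013", "Chinese", "C"),
    ("10013", "Italian", "A"),
]

# Assumed ordering of grades: a lower rank means a better grade.
GRADE_RANK = {"A": 0, "B": 1, "C": 2}


def zips_meeting_threshold(rows, cuisine, min_grade, min_share):
    """Return zip codes where at least `min_share` of the restaurants
    of the given cuisine have a grade of `min_grade` or better."""
    total = defaultdict(int)
    good = defaultdict(int)
    for zip_code, row_cuisine, grade in rows:
        # Skip other cuisines and rows without a recognized letter grade.
        if row_cuisine != cuisine or grade not in GRADE_RANK:
            continue
        total[zip_code] += 1
        if GRADE_RANK[grade] <= GRADE_RANK[min_grade]:
            good[zip_code] += 1
    return sorted(z for z in total if good[z] / total[z] >= min_share)


result = zips_meeting_threshold(restaurants, "Chinese", "B", 0.60)
print(result)
```

With the sample rows, 2 of 3 Chinese restaurants in 10002 are graded B or better, while only 1 of 2 in 10013 is, so only 10002 passes the 60% threshold.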

Question: Who do you think the app is for? Based on what you submitted, is the app for inspectors? Would they use the app to make it easier to find restaurants that might have violations? Or is it for everyday people?

lovelydogggy commented 4 years ago

Hi guys, sorry, I have interviews these days. I will start my work this weekend.

wwyws0000 commented 4 years ago

Hi Adam,

For 3 and 4, that's what I'm thinking as well: a heatmap of average scores by zipcode. Or we can do a dot overlay on the map where each dot represents a restaurant, and the color/size of the dot changes according to its score. Either way should be good.

IMO the app is for anyone who wants to know inspection results for a specific restaurant or group of restaurants. It could be an individual who just wants to see if the restaurant they're heading to is graded well, or it could be an inspector who wants to see some trends/analysis.

clycrystal commented 4 years ago

From a practical point of view, most of the app's users will be people who want to pick a good restaurant to dine at, so they would basically consider the following factors: 1. cuisine type; 2. location; 3. menu, reviews, price; 4. violations (food safety, hygiene). And they would rank each factor based on personal preference.

So as for the dataset Wenyue provided, to better sort out the violations info, my suggestions are:

  1. Present an overview: a. the percentage of each Grade category filtered by "cuisine type", "zip code" or both; b. Top or Grade A restaurants filtered by "cuisine type", "zip code" or both; c. the percentage of each "cuisine type" filtered by "Grade", "zip code" or both.

  2. Provide more details on a specific restaurant: a. as Wenyue says above, mark all the restaurants on the map, with the dot size or color corresponding to the Grade. This can also be filtered by "cuisine type" and "borough"/"zip code", so the map only shows dots that meet the set conditions. b. If users are interested in a particular restaurant, they can either type the name in a search bar or click it on the map, and more detailed info about that restaurant will be presented in a table incl. street, phone, cuisine, all violation codes/descriptions, grade, etc. (I notice that each restaurant may be cited many times, so maybe when we present the detailed info of a restaurant, we can calculate the average score and then assign it a grade.)
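The averaging idea at the end of that comment could look roughly like the sketch below. The score cutoffs (0-13 for A, 14-27 for B, 28 and above for C) are an assumption based on the published NYC letter-grade bands and should be checked against the dataset documentation; the sample inspections and the function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample rows: (restaurant name, inspection score).
inspections = [
    ("Restaurant A", 10),
    ("Restaurant A", 16),
    ("Restaurant B", 30),
    ("Restaurant B", 32),
]


def grade_for(score):
    """Map a score to a letter grade. Cutoffs are an assumption:
    0-13 -> A, 14-27 -> B, 28 or more -> C (lower scores are better)."""
    if score <= 13:
        return "A"
    if score <= 27:
        return "B"
    return "C"


def average_grades(rows):
    """Average all scores per restaurant, then grade the average."""
    scores = defaultdict(list)
    for name, score in rows:
        scores[name].append(score)
    summary = {}
    for name, values in scores.items():
        avg = mean(values)
        summary[name] = (avg, grade_for(avg))
    return summary


summary = average_grades(inspections)
print(summary)
```

One thing to weigh with this design: an averaged grade can differ from the most recent official grade, so the app may want to show both.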

As for the dataset Na provided, we can either merge the two datasets by index(restaurant maybe? but it seems not that easy) or design a separate sector for this.

This is my idea right now. I hope you guys can combine my suggestions into the outline and then design the overall layout based on that outline. Thanks for reading and have a great weekend!!!~

wwyws0000 commented 4 years ago

Crystal - did Na provide a dataset somewhere? I think I missed it.

akrav commented 4 years ago

In the email chain we have, Na posted this dataset: https://data.cityofnewyork.us/Health/DOHMH-MenuStat/qgc5-ecnb. She also said in the email that she was working on the app, but I don't think she has committed it yet.

clycrystal commented 4 years ago

yup, it's about the menu and food nutrients.

wwyws0000 commented 4 years ago

Oh okay I found it now. The only problem I see is that the data only has information on nutrition facts of the menu items. I'm not sure if that's enough to make our app a "restaurant recommendation" app because

clycrystal commented 4 years ago

Yes, you are right, so I was wondering whether we could use an index like the restaurant's name to match some info from the first dataset, and use this dataset to create another page mainly focusing on nutrients.

In this way, we could set our app's theme as food safety and nutrition facts, which could be grouped under one topic, "Healthy diet", and thus target users who care a lot about a healthy diet and want to make sure the restaurant they choose is safe and healthy. And I think that'll make our app's content richer than just telling the violation story.

nazhuo commented 4 years ago

Hi guys, sorry for the late reply, I didn't know we were discussing on GitHub. After trying to map the restaurants and merge the two datasets by name, I found that the menu dataset covers some chain stores like 7-Eleven. I apologize for the misleading suggestion, and I am sincerely sorry for wasting your time. After searching NYC OpenData for several restaurant-related keywords, I don't think there is other data available. I think we can focus on the data provided by Wenyue.

wwyws0000 commented 4 years ago

Not a problem Na. I couldn't find any other restaurant datasets either so I'll only focus on the violations.

I just committed my updated code where I enhanced/completed some of the features I mentioned last time. Mainly, I changed the Overview tab to Comparison, where users can compare top violations, inspection scores, and grade breakdowns using two sets of filters.

Top Violations: image

Inspection Score: image

Grade Breakdown: image

Let me know if you guys have any suggestions!

akrav commented 4 years ago

Hey guys, I added a leaflet map to the map tab. I'm still trying to figure out how to add boundaries around certain zipcodes/neighborhoods/boroughs. I'm going to try to figure that out tonight when I have more free time. Any suggestions would still be much appreciated though.

Also @wwyws0000, no suggestions at the moment, what you did today was great!

akrav commented 4 years ago

Hey guys, I got some tiles up on the map, but I don't really understand how it works. If someone finds an easy way to add tiles on top of the map, please commit it to the master branch. I'll keep researching too.

wwyws0000 commented 4 years ago

Thanks Adam. I'm a bit occupied for the next couple of days but will try to figure something out by the end of the week.

nazhuo commented 4 years ago

Hi guys, I just discussed with Luyue and we wrote some draft code for restaurant recommendations. I have committed it, but I think some adjustments must be made before it can work. Generally, our code shows a dot for each restaurant on the map and also recommends restaurants according to the user's input for zipcode, cuisine type and so on. I am now thinking about how to use a checkbox as a filter. We will update our code after fixing this problem. By the way, is everyone okay with having a short meeting after class to discuss our next step? If that time doesn't work, we can pick another time. Thank you guys for reading.

wwyws0000 commented 4 years ago

That sounds great! I've been sick since yesterday so I won't be in class today. If you guys end up meeting just let me know what I need to do.

nazhuo commented 4 years ago

Hi guys, Luyue and I just finished the code for restaurant recommendation. I just committed the new version of our code. In our app, entering a cuisine type and grade letter displays the dots of the matching restaurants on the map, and there is a list of matching restaurants above the map. I think that's what we can do so far. @wwyws0000 Hope you get well soon!

akrav commented 4 years ago

Hi, I'm still stuck trying to add colored tiles on the map. I decided to use the code that was given to us in the app folder to make a choroplethr map, but I don't know what it's outputting. Can someone help me figure this out?

I'm also stuck trying to group all the zip codes in one column and then add up all the values in a second column. For example, if Restaurant A is in zip code 10000 with a violation score of 10 and Restaurant B is in zip code 10000 with a violation score of 30, the output should be a table with a zip code column and a violation score column, showing that zip code 10000 has a score of 40.
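The grouping described above is a group-by-and-sum. A minimal sketch in plain Python, using the two restaurants from the example plus a hypothetical third row in another zip code; the function name `total_score_by_zip` is also just illustrative:

```python
from collections import defaultdict

# Rows: (restaurant name, zip code, violation score).
# Restaurants A and B come from the example; Restaurant C is a made-up extra row.
rows = [
    ("Restaurant A", "10000", 10),
    ("Restaurant B", "10000", 30),
    ("Restaurant C", "10001", 12),
]


def total_score_by_zip(rows):
    """Sum violation scores for every zip code."""
    totals = defaultdict(int)
    for _name, zip_code, score in rows:
        totals[zip_code] += score
    return dict(totals)


totals = total_score_by_zip(rows)

# Print a two-column table: zip code and summed score.
for zip_code, score in sorted(totals.items()):
    print(zip_code, score)
```

Note that a raw sum favors zip codes with few restaurants, so an average per restaurant may be a fairer number to put on the map.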

akrav commented 4 years ago

Ignore what I just sent I think I figured it out

akrav commented 4 years ago

Hey guys, I finished my part and I'm going to commit it in the morning. Also, as a side note, did our professor tell us that we had to put our code on some website? I remember him telling us something like that, but I can't find where in the project 2 notes he talked about it.

akrav commented 4 years ago

Hey guys, I added Na and Luyue's code to wenyue doc. 3 things:

  1. We still need to add the data that Na and Luyue used in their code.
  2. We need to read Na and Luyue's data from the data folder, not from a local desktop.
  3. I'm going to be out of town until Wednesday, so I'm not going to be able to do much more work on this project until the presentation (sorry!). See you guys next week.

wwyws0000 commented 4 years ago

Thanks Adam. I will take a look at it tomorrow!

wwyws0000 commented 4 years ago

Adam- couple suggestions on your code:

Na & Luyue:

akrav commented 4 years ago

@wwyws0000 I followed your suggestions and fixed the code. As I said before, I'm not in town and I had to fix the code on a borrowed computer. Sorry, but I don't know if I can do much more since I have limited computer access. Hopefully I fixed all the problems. See you guys this weekend. Also, I think we need to add the code to some website, can someone do that?

lovelydogggy commented 4 years ago

Hi guys, I am trying to publish our app, but it shows "Disconnected from the server" on shinyapps.io. Do you know how to solve the problem? image

lovelydogggy commented 4 years ago

I am trying to publish our code to shinyapps.io and write the summary on GitHub. I am also thinking about how to put together a presentation this weekend. Could we meet on Mon or Tue to discuss the final output?


wwyws0000 commented 4 years ago

Adam -

The filter you applied to get unique restaurants uses "RECORD DATE", which has only one value: the date the data was pulled. So it does nothing (the filtered data has exactly the same number of rows as the unfiltered data). The correct date filter is "INSPECTION DATE".
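A small sketch of why the two columns behave differently, assuming the usual layout of the inspection dataset: a per-restaurant id column (taken here to be "CAMIS"), dates formatted as MM/DD/YYYY, and a "RECORD DATE" that is identical on every row. The sample records and the helper name `latest_inspection` are hypothetical, and this is plain Python rather than the app's own code.

```python
from datetime import datetime

# Hypothetical sample records; every row shares the same RECORD DATE.
records = [
    {"CAMIS": "1", "INSPECTION DATE": "03/01/2018", "RECORD DATE": "10/01/2019", "SCORE": 20},
    {"CAMIS": "1", "INSPECTION DATE": "05/15/2019", "RECORD DATE": "10/01/2019", "SCORE": 9},
    {"CAMIS": "2", "INSPECTION DATE": "01/20/2019", "RECORD DATE": "10/01/2019", "SCORE": 27},
]


def latest_inspection(rows):
    """Keep one row per restaurant: the one with the newest INSPECTION DATE."""
    latest = {}
    for row in rows:
        when = datetime.strptime(row["INSPECTION DATE"], "%m/%d/%Y")
        key = row["CAMIS"]
        if key not in latest or when > latest[key][0]:
            latest[key] = (when, row)
    return [row for _when, row in latest.values()]


# RECORD DATE has a single distinct value, so it cannot separate rows.
distinct_record_dates = {row["RECORD DATE"] for row in records}

unique_restaurants = latest_inspection(records)
print(len(distinct_record_dates), len(unique_restaurants))
```

In the real data one inspection can span several violation rows with the same date; this sketch keeps only the first such row per restaurant, which is enough for a one-row-per-restaurant view but not for listing all violations.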

I'll fix this myself.

akrav commented 4 years ago

Thanks, sorry about that. When I looked for the inspection date I only saw inspection type, so I got confused and thought the record date had to be the right field to filter by.

wwyws0000 commented 4 years ago

No problem. Also, can you point me towards where you got the neighborhood data from (the .nbhd variables that were not in use) ? I think we might be able to find use for it after all.

lovelydogggy commented 4 years ago

I have published part of the app, and the link is https://lovelydoggy.shinyapps.io/Proj2_pre/. The issue is that when I click "Inspection Grade", it sometimes disconnects from the server. I will look into how to solve it.

akrav commented 4 years ago

@wwyws0000 I got the data from https://www.health.ny.gov/statistics/cancer/registry/appendix/neighborhoods.htm and I just hard coded each neighborhood name making sure it was in the same relative position as its corresponding zip code.
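Hard-coding by relative position works, but it breaks silently if either list is reordered. A hedged alternative is an explicit lookup keyed by zip code, sketched below in plain Python. The two neighborhood entries are illustrative samples and should be verified against the source table; the names `NEIGHBORHOOD_ZIPS` and `neighborhood_for` are hypothetical.

```python
# Illustrative entries only; verify each against the source table before use.
NEIGHBORHOOD_ZIPS = {
    "Chelsea and Clinton": ["10001", "10011"],
    "Lower East Side": ["10002", "10003"],
}

# Invert to a zip -> neighborhood lookup so order no longer matters.
ZIP_TO_NEIGHBORHOOD = {
    zip_code: name
    for name, zip_codes in NEIGHBORHOOD_ZIPS.items()
    for zip_code in zip_codes
}


def neighborhood_for(zip_code):
    """Return the neighborhood name for a zip code, or "Unknown" if unmapped."""
    return ZIP_TO_NEIGHBORHOOD.get(zip_code, "Unknown")


print(neighborhood_for("10002"))
print(neighborhood_for("99999"))
```

The explicit fallback value also makes zip codes missing from the table visible instead of shifting every later label by one.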

wwyws0000 commented 4 years ago

Hi all,

I made the final edits, including adding additional feature improvements and updating the summary/contribution statements.

@lovelydogggy Please submit this to the shiny server. You may need to install another package (shinyWidgets) for the code to work.

Is everyone good with meeting 30 min before class to discuss and go over presentation?(around 5:40 pm)

Wenyue

nazhuo commented 4 years ago

@wwyws0000 Thank you for your final edits! I am okay with meeting 30 min before class. See you guys then.

akrav commented 4 years ago

Sounds good to me.

lovelydogggy commented 4 years ago

The website is https://lovelydoggy.shinyapps.io/Final_presentation/.