sussmanbu / ma-4615-fa23-final-project-information-illumin8ors

ma-4615-fa23-final-project-information-illumin8ors created by GitHub Classroom

Project Sharing Feedback #15

Open dpmcsuss opened 10 months ago

dpmcsuss commented 10 months ago

Feedback from presentation by Arianit Balidemaj

1. Describe the main idea of the project

Unpacking the international drug survey to see how income and location inform drug use

Research on the relationship between drug use and other factors like races

Their project studies the usage of the 6 most popular drugs plus alcohol among different races.

Cigarettes, substance abuse, against race and age

The main idea of the project was to study a drug survey for the 6 most used drugs in the US

Drug use between races with geography, age when they first tried drug, and income

Drug use, 6 different main drugs, age, race difference, logistic regression, cigarette binary output

The project explores the complex relationship between cigarette smoking, race, and mental health.

2. What was the best part of the team's work?

The topic! really cool work, I like the demographic work. A lot of the graphs are super cool, a well-represented study.

The finding of team 8 is that there is a difference between drug use among different racial groups. Team 8 used linear regression to find out the relationship between their variables, with great detailed procedures like correlation matrix and box-and-whisker plot to analyze the variance.

They mentioned the limitations of the survey data. It has a representation problem: white people responded to the survey more than any other race.

Good analysis, in depth, explanation and purpose for each decision. For example, the use of baselines for effective comparison.

I think their figures were very informational and well done. He mentioned they were trying to create a map which I think would be very interesting as well.

I really liked the graphs they presented because they were easy to understand and looked nice with the different colors and concise legend.

Different models to compare results, with Asian or another group as the baseline; white respondents have more data; and the box plot of race and age of marijuana use is very visual.

The best part of the project is its visualization of race against the age at which respondents first encountered cigarettes. The boxplot seems reasonable and is well presented.

3. How would you suggest improving the team's work?

Clarify the figures, and describe how and why you chose to represent the data (i.e., why are Hispanics a baseline, and what would happen if you switched the baselines for the same analysis?)

It could be better to provide more graphs, as they make it easier for the readers/audience to grasp the research results and see the relationships.

The team did a great job on everything. They can talk more about some possible reasons behind the findings.

The team uses raw numbers rather than proportions for the race variable, which could skew their analysis

I would suggest using the proportion of each race instead of the raw counts, to help reduce skew in the data from the number of respondents of each race.

I think that the team's work was really good, and I think that they could include more of the geography portion, as they mentioned it but didn't really have it.

Since the data has more white respondents, maybe split off the white data and analyze it as its own model, for example, age of marijuana use among white people by region?
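A minimal sketch of the counts-versus-proportions point made in the two comments above. The records are hypothetical toy data, not values from the team's survey, and `usage_counts_and_rates` is an illustrative helper name rather than anything from the project's code.

```python
from collections import Counter

def usage_counts_and_rates(records):
    """Return (raw user counts, share of each group reporting use)."""
    totals = Counter(group for group, _ in records)
    users = Counter(group for group, used in records if used)
    rates = {group: users[group] / totals[group] for group in totals}
    return users, rates

# Hypothetical toy data: (race group, reported use) pairs.
# One group has five times as many respondents as the other.
records = (
    [("White", True)] * 30 + [("White", False)] * 70
    + [("Asian", True)] * 6 + [("Asian", False)] * 14
)

raw_counts, rates = usage_counts_and_rates(records)
print(dict(raw_counts))  # raw counts differ because group sizes differ
print(rates)             # proportions are comparable across groups
```

In this toy example the raw counts (30 versus 6) suggest a large gap, while the within-group proportions are identical, which is the skew the comments warn about.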

Set up a clear metric for the logistic regression, such as a confusion matrix or a cost-benefit matrix, since it is a classification problem.
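As a rough illustration of that suggestion, the sketch below builds a confusion matrix from predicted probabilities and attaches a simple cost to each kind of error. The labels, probabilities, 0.5 threshold, and cost values are all assumptions chosen for the example; they do not come from the team's model.

```python
def confusion_matrix(actual, predicted_prob, threshold=0.5):
    """Count true/false positives and negatives at a given threshold."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for y, p in zip(actual, predicted_prob):
        predicted = 1 if p >= threshold else 0
        if predicted == 1:
            counts["tp" if y == 1 else "fp"] += 1
        else:
            counts["fn" if y == 1 else "tn"] += 1
    return counts

def accuracy(cm):
    """Share of all predictions that were correct."""
    return (cm["tp"] + cm["tn"]) / sum(cm.values())

def total_cost(cm, cost_fp, cost_fn):
    """Weight each error type by an assumed cost."""
    return cm["fp"] * cost_fp + cm["fn"] * cost_fn

# Hypothetical outcomes (1 = smoker) and fitted probabilities.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
probs = [0.9, 0.2, 0.4, 0.7, 0.6, 0.1, 0.8, 0.3]

cm = confusion_matrix(actual, probs)
acc = accuracy(cm)
cost = total_cost(cm, cost_fp=1, cost_fn=5)  # assumed: a miss costs 5x a false alarm
print(cm, acc, cost)
```

Changing the threshold trades false positives against false negatives, so the cost weights are what make one threshold preferable to another.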

4. Do you have any other comments or ideas?

Overall great

The modeling of this project still requires effort.

Feedback from presentation by Wenting Chen

1. Describe the main idea of the project

The main topic of the project is the drug usage. They look at variables like race, income, urban/rural areas.

2. What was the best part of the team's work?

I like the linear regression model they made, which helps address the dataset limitations, like the income ranges.

3. How would you suggest improving the team's work?

I would suggest that the team make a demographic plot based on the variables and main topic they are analyzing.

Feedback from presentation by Eleanor Paul

1. Describe the main idea of the project

Focused on six substances, what variables affect substance use (i.e. age, race)

Analyzing the impact of Race and Wealth on Drug Use (2021)

2. What was the best part of the team's work?

Strong awareness of any limitations of the dataset (and how to combat them), good start on logistic model and good idea for linear model

3. How would you suggest improving the team's work?

Merging data sets seems to not have worked out (but the idea of making a large map of the US seems to combat that problem)

The group has established a working thesis, but they haven't completed the data analysis to properly back their thesis, so just finishing that would be the only suggestion

Feedback from presentation by Xiaojing Yang

1. Describe the main idea of the project

Group 8 looked at the relationship between drug addiction and different demographics (race, gender, socioeconomic status)

Analyze the use of different drugs and its relationship with race, income, medical condition, and demographics.

Drug use, by drug type, among people with different political opinions.

Relationship of specific drug use with different races, medical conditions and cities

Analysis of the relationship between the pattern of drug use and race.

2. What was the best part of the team's work?

Presenter has a clear understanding of the team's work, dataset, and analysis. Effectively cleaned, loaded, and analyzed data despite difficulties using PDF as source of data

The group has demonstrated commendable proficiency in the challenging task of data cleaning, particularly given the complexity of working with a PDF dataset. Managing data in PDF format can be inherently challenging due to the unstructured nature of the content, making it commendable that the group has navigated through these difficulties effectively.

They got data from all the different states, analyzed different drugs, and also talked about how many times respondents used them.

They drew a plot of cocaine use by race and a box plot of alcohol use by race, and the outcome is that there is no huge difference between white respondents and other races.

They spent a huge amount of time on data cleaning, because their data are scattered across the PDF file and they had to merge them into a single data set.

3. How would you suggest improving the team's work?

Graphs are effective in showcasing subject of team's analysis, but could be more aesthetically-pleasing and consistently scaled (graphs are skewed by population size).

The group has made substantial progress in their analysis; however, there appears to be an opportunity for further enhancement, particularly in the realm of the regression model. It might be beneficial for the group to consider additional work on refining and extending their regression model to extract deeper insights from the dataset.

They use all the data sets, including alcohol and tobacco as well, and that may lead some people to a different opinion.

Apply multiple regression models to see which specific drugs have a significant relationship with the other variables.

If you introduce the state variable into your model to adjust the outcome (pattern of drug use), it would be insightful to take into account each state's policy against illegal drugs.

Feedback from presentation by Tongdan Shentu

1. Describe the main idea of the project

The predictive effect and relationship between race, region, and drug use

The main idea of the project focuses on drug use and racial groups.

Relationship between race and drug use, in addition to other variables.

Focus on the relationship between job and drug use.

The group discusses the relationship between racial disparities and drug use. They then go on to talk about how geographical factors affect drug use, and stress how different income levels affect drug use.

Exploring the effects of age, race, city, and income on drug use

2. What was the best part of the team's work?

Relied on codebooks to clean and recode the data. Very detailed and thorough data visualization, with thorough analysis of each figure. Chose the right model for the binary outcome.

The best part of the team's work is that they have clear goals and a clear strategy to achieve these goals.

Good use of graphs, and the explanations of the graphs were good. Good approach to what to do next and what is expected to come.

There are lots of visualizations, and they are very diverse! The annotations for the graphs are clear and concise, so we can easily understand the variables.

They have a lot of linear and generalized models to predict drug use. All three of their independent variables are very clear to understand and follow along.

The best part of the team's work is using many variables and a logistic regression, and their findings are very interesting.

3. How would you suggest improving the team's work?

Could be more specific about the statistical model results and the model selection process. Since the group is researching region, they could add a map component to the project.

I'd suggest the team try zooming in on their comparative graphs. It's hard to see them clearly. But overall the project is interesting.

A little more elaboration on what the graphs are representing would be nice, in my opinion. It was still good, overall.

Maybe clarify the trends and the explanation for the results more. Future exploration required. Try to focus more on the main idea.

I think they show a lot of code, so they could include more plots to visualize their work for audience members who are not familiar with programming languages.

I would suggest making more graphs and plots to provide visual aids.

4. Do you have any other comments or ideas?

Good job.

Feedback from presentation by Not Reporter

1. Describe the main idea of the project

To determine whether income and geographic location were associated with drug use

The project explores drug usage among lower and higher income individuals.

Drug use differences among certain groups and races.

It is about idea about climate change with weather in different location

Impacts of tobacco and other drug use on different races and ages.

Race or income/ size of a county/mental health is a good predictor for substance use

Using a national drug use dataset from 2021, narrowed down to focusing on 6 substances, alcohol, tobacco, marijuana, crack, cocaine, and heroin. Wanted to see if these different substances would be used more by specific racial groups and groups of varying economic status. Had to take into account false reports from the survey. Some race categories had a much larger proportion of respondents so their findings would be proportional as well. Merging a dataset that maps metro and non-metro counties across the country in order to try and locate the survey respondents from their main dataset since it didn't provide much information about location.

Tobacco, alcohol, cocaine... 6 substances. Try to figure out the relationship between income and use of cocaine.

2. What was the best part of the team's work?

The team had a good thesis and a well-thought-out approach to the problem. I like how thorough they were with their analysis.

The graphs were very clear and did a good job communicating the differences among groups who use drugs, such as high versus low income individuals.

Nice plots, easy to read and interpret, and they support their analysis perfectly. They found that certain ethnic groups are more likely to experiment with or use drugs, and that this prevalence varies between urban and rural areas. Higher income is associated with more drug use, while lower income is linked to increased drug use.

One of the best parts is that they have a nice small blog post in which every graph has good visualization to help us understand the relationships.

I liked how they investigated the dataset well and, using their observations, cleaned the dataset to minimize the bias across the race variable.

They incorporated census data because their dataset lacks location data, and they noticed the possible bias in their dataset.

I think the best part of the team's work was narrowing down to different substances that have a good chance of providing meaningful findings because of the histories of the substances they are looking at. For example, as they mentioned, drugs like cocaine may be expected to be used among higher income respondents, while drugs like crack and heroin would be expected to be used by lower income respondents.

Used the proportion of use, since the number of observations differs across race groups. Used a regression model to show the relationship. Used maps to show which counties respondents come from.

3. How would you suggest improving the team's work?

Improve the delivery of the results - code isn't that important, but discussion is very important. You don't need to show the load and clean script

I would narrow the thesis since there is a lot to cover and it might be a lot for the full project.

Maybe add more predictors and try different models to enhance their study. Also, give more explanation of the model parameters (like how drug use changes when the predictors change) and assess the goodness of fit.

I think it would be better if they gave a more detailed explanation of their idea, and when I saw the code it was pretty unorganized. I think making the code easy to read would be better.

They don't have the specific locations since the data is from National Health. They would need to make a sub group for the locations

Maybe they can do cross-validation to compare the performance of different models that use their variables to predict substance use.
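One way to picture that suggestion is the sketch below, which runs k-fold cross-validation on two deliberately simple classifiers. Everything in it is an assumption made for illustration: the group labels, the outcomes, the choice of 3 folds, and the two stand-in models (a majority-class baseline and a per-group majority rule), which take the place of whatever models the team actually fits.

```python
from collections import Counter, defaultdict

def k_fold_indices(n, k):
    """Yield (train, test) index lists using k interleaved folds."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, folds[i]

def fit_majority(xs, ys):
    """Baseline: always predict the most common training label."""
    label = Counter(ys).most_common(1)[0][0]
    return lambda x: label

def fit_group_majority(xs, ys):
    """Predict the most common training label within each group."""
    by_group = defaultdict(Counter)
    for x, y in zip(xs, ys):
        by_group[x][y] += 1
    fallback = Counter(ys).most_common(1)[0][0]

    def predict(x):
        if x in by_group:
            return by_group[x].most_common(1)[0][0]
        return fallback  # group never seen in training

    return predict

def cv_accuracy(xs, ys, fit, k=3):
    """Average held-out accuracy across k folds."""
    scores = []
    for train, test in k_fold_indices(len(xs), k):
        predict = fit([xs[i] for i in train], [ys[i] for i in train])
        correct = sum(predict(xs[i]) == ys[i] for i in test)
        scores.append(correct / len(test))
    return sum(scores) / len(scores)

# Hypothetical data: group "A" always reports use (1), group "B" never does (0).
xs = ["A" if i % 2 == 0 else "B" for i in range(12)]
ys = [1 if x == "A" else 0 for x in xs]

majority_score = cv_accuracy(xs, ys, fit_majority)
group_score = cv_accuracy(xs, ys, fit_group_majority)
print(majority_score, group_score)
```

Because each model is scored only on rows it did not see during fitting, the two averages can be compared directly, and the same loop would work for any pair of models that expose a fit-then-predict interface.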

I think that they might not have even needed to focus on products like alcohol, tobacco, and marijuana since their findings seem to not provide much meaningful insight, so I think if they were to focus on the use of crack, cocaine, and heroin, they would have a better chance of presenting data that may show significant findings that could go against or support the common stereotypes around those drugs today.

My suggestion is to make more plots for each substance, since they are trying to incorporate 6 substances in their data set. The parts are pretty good, and I hope that they can successfully figure out their model.

4. Do you have any other comments or ideas?

Since the data set is the same as ours, except for the year, I personally think that their choice of topic, analyzing the relationship between the use of multiple substances and income level, is better than what we have right now.

ArianitBalidemaj commented 10 months ago

Thank you