Closed by damonmcc 9 months ago
As a follow-up to the coding caucus meeting on 11/30: the GIS team has received a few internal and external questions about how zoning districts have changed over time. It would be interesting to do a pilot project (perhaps for 2 points in time) to analyze zoning code and area changes. Damon gave an example of how cells can be tracked in space in biology; we can research whether a similar approach would work here.
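To make the two-points-in-time idea concrete, here is a minimal sketch of an area-change comparison. Everything in it is an assumption for illustration: the district codes, the toy rectangles, and the helper names (`polygon_area`, `area_by_district`, `area_change`) are hypothetical, and a real pilot would read actual zoning geometries with a GIS library rather than hand-typed coordinates.

```python
from collections import defaultdict


def polygon_area(coords):
    """Shoelace formula: area of a simple polygon given as a list of (x, y) vertices."""
    total = 0.0
    # Pair each vertex with the next one, wrapping around to the first.
    for (x1, y1), (x2, y2) in zip(coords, coords[1:] + coords[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def area_by_district(features):
    """Sum polygon area per zoning district code for one snapshot."""
    totals = defaultdict(float)
    for district, coords in features:
        totals[district] += polygon_area(coords)
    return dict(totals)


def area_change(before, after):
    """Per-district area difference (after minus before); a missing district counts as 0."""
    districts = sorted(set(before) | set(after))
    return {d: after.get(d, 0.0) - before.get(d, 0.0) for d in districts}


# Hypothetical toy snapshots: (district code, polygon vertices) in arbitrary planar units.
snapshot_a = [
    ("R6", [(0, 0), (4, 0), (4, 4), (0, 4)]),
    ("C4", [(4, 0), (8, 0), (8, 4), (4, 4)]),
]
snapshot_b = [
    ("R6", [(0, 0), (6, 0), (6, 4), (0, 4)]),
    ("C4", [(6, 0), (8, 0), (8, 4), (6, 4)]),
    ("R7A", [(0, 4), (2, 4), (2, 6), (0, 6)]),
]

changes = area_change(area_by_district(snapshot_a), area_by_district(snapshot_b))
for district, delta in changes.items():
    print(f"{district}: {delta:+.1f}")
```

This only compares total area per district code. It does not track which specific parcels changed designation, which is where something like the cell-tracking approach might come in.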
@mbh329 for any project ideas
The team decided not to submit a proposal due to limited bandwidth and plans for internal data science work.
I have to switch my notifications so that I actually get notified when you tag me!
due 12/11 or 12/8?
Columbia Data Science M.S. capstone projects
some favorite previous projects
from submission form
PROJECT DESCRIPTION Tell us more about your project.
Project Title
MOTIVATION, BACKGROUND AND OVERVIEW: Please state briefly what is the problem that the project tackles. The projects need to be focused on a data science problem that is engaging, relevant, clearly defined and of the right scope for a semester. When assessing the proposals we will be looking for a diverse set of problems that address different topics and technical requirements that our students can address. The evaluation criteria will include: Is this a data-science project? Can our students learn about a data science application in the real world? Is the proposed research problem important and can potentially have a big impact? Will our students be excited about it? Please provide your project description having these criteria in mind.
DATASETS The dataset(s) can be public or private. Please keep in mind that the students will need to list the project on their CV and the report will be public.
Please provide a sample of your dataset with at least 10 rows of tabular data, images, and/or metadata.
If you’re using private data, have you confirmed with relevant stakeholders (e.g. legal, compliance, communications) that the data can be used for this project? (If the answer is No, please contact Jessica Rodriguez at jr3056@columbia.edu when you submit this form).
DATASET: Please provide a detailed description of the type of data that is required to address the problem. For example, is this social media data, medical data, financial data, etc.? What is the size of the data? Will the organization provide the majority of the data, or is the data accessible via other avenues/sources? How much of the data is available? Do the students need to gather data? In assessing the projects, the availability and type of data will play an important role. Please consider these evaluation criteria for data requirements when submitting the proposal: Is the data set clearly defined? Is the data set complex and big enough for creating learning opportunities? Is the data set ready (availability, need for processing)? Does the data require extensive computing resources (if yes, can the affiliates provide resources/funding)?
DATA TYPE: Public data is data made available by a third party and is available to the general public. Novel data is data that has been recently published by the proposer or will be made public as part of this project. Private data is data that cannot be made available after the project ends. Please check all that apply: Uses Public Data; Uses Private Data; Uses Novel Data
HOW WILL THE DATASET BE MADE AVAILABLE? For example: CSV/XLS file, remote database, raw images or documents, REST endpoint, etc.
Type of Data: Graphs, Networks; Text Data; Audio/Image/Video; Geospatial; Time Series
Work Requirements (Check all that apply): R; Scraping (including API); Database (e.g. SQL); Preprocessing; Visualization; App/tool building
GOALS, OUTCOME and SKILLS Research Goals
Project Topic: Social Good; Biomedical; Physical Sciences (chemistry, climate, etc.); Consumer; Social Media; Finance and Economics
Data Science Areas in this project? Statistics; Causal Inference; Deep Learning; Reinforcement Learning; Algorithms
Expected Outcome? Model; Report; Paper; Software; Other
SKILLS: What skills should students expect to learn through their project? Check all that apply: Project planning and scoping; Data acquisition and scraping; Data versioning and management; Data cleaning; Combining data sources; Exploratory data analysis and visualization; Supervised modeling; Unsupervised modeling; Establishing evaluation metrics; Working with text data; Working with image data; Working with time series data; Working with tabular data; Working with geospatial data
What is the goal of this project? What questions do you want answered? What has been done already to achieve this goal?
What are the ethical considerations?
Are there any ethical concerns about the proposed project such as privacy, transparency, and bias that we should pay special attention to?
What is the relevant background needed for the project? In order to make sure we build the right team of students for each project, please provide information on the relevant background that someone working on the project should have: what technical skills they should have, and what relevant literature (please provide citations) or tools (please provide links) they will need to know or be able to learn.
What are the quantitative and/or qualitative metrics that can be used to judge the successful completion of the capstone project?
Are international students on an F1 or J1 student visa eligible to work on this project?
Are you willing and/or able to work with students who are currently physically in another country (if time zone is not an issue)?
Are you willing to work with two teams of students?
from initial email
The Capstone Project course provides a unique opportunity for students in the M.S. in Data Science program to apply their knowledge of the foundations, theory, and methods of data science to data-driven problems in industry, government, the nonprofit sector, or academia. Course activities focus on semester-long projects sponsored by our Industry Affiliates, NYC, or an academic research lab. Each project synthesizes the statistical, computational, engineering, and social challenges involved in solving complex real-world problems. Typically, four or five students work together as a team on each project. Each team is supervised by a faculty mentor and/or an industry mentor, and projects typically progress through the following phases: