preseries / GASP

The General Accepted Startup Principles project
MIT License

How can we get such detailed data on a startup? #3

Closed (bearnxx closed this issue 5 years ago)

bearnxx commented 5 years ago

I work at a VC firm that invests in early-stage companies, from seed to Series B. Early-stage startups normally don't have such detailed data. If the table has a lot of blanks, how can the model perform well?

fabdrnd commented 5 years ago

Hi! The idea of GASP is to collect as many data points as possible on as many startups as possible. It is true that early-stage companies do not have such detailed data, but the plan is to collect it over time (monthly) and look at how the metrics evolve, including metrics that only become available later.
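To make "look at the evolution of the metrics" concrete, here is a minimal sketch of computing month-over-month change from a series that still has gaps. The function name `monthly_growth`, the metric (`mrr`), and the numbers are all hypothetical and only for illustration; this is not code from GASP or PreSeries.

```python
from typing import List, Optional


def monthly_growth(series: List[Optional[float]]) -> List[Optional[float]]:
    """Return the month-over-month relative change for each month.

    A month gets None when it, or the month before it, was not reported
    (or when the previous value is 0, so the ratio is undefined).
    """
    out: List[Optional[float]] = [None]  # the first month has no predecessor
    for prev, cur in zip(series, series[1:]):
        if prev is None or cur is None or prev == 0:
            out.append(None)
        else:
            out.append((cur - prev) / prev)
    return out


# Hypothetical monthly recurring revenue; None marks months not yet reported.
mrr = [None, None, 1000.0, 1200.0, None, 1800.0]
growth = monthly_growth(mrr)
```

Gaps stay as explicit `None` values instead of being filled in, so a later modeling step can decide how to treat them.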

The more data is collected on different startups, the better the model performs. This takes time, and it is highly unlikely that every data point will be covered, but predictive models based on decision trees are good at dealing with missing values.
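One common way tree-based models cope with blanks is to learn, at each split, which branch missing values should follow. The sketch below shows that idea on a single split (a "decision stump") in plain Python. It is an illustrative assumption about the technique, not the model PreSeries uses; the names `fit_stump` and `predict` and the toy data are hypothetical.

```python
import math


def majority(labels):
    """Most common 0/1 label; defaults to 0 for an empty or tied group."""
    return 1 if 2 * sum(labels) > len(labels) else 0


def fit_stump(X, y, feature):
    """Fit one split on one feature, also learning where missing values go.

    X: rows of floats, with math.nan marking a missing value.
    y: 0/1 labels.
    Returns (feature, threshold, missing_goes_left, left_label, right_label).
    """
    values = sorted({row[feature] for row in X if not math.isnan(row[feature])})
    best = None
    for t in values:
        for missing_left in (True, False):
            left, right = [], []
            for row, label in zip(X, y):
                v = row[feature]
                goes_left = missing_left if math.isnan(v) else v <= t
                (left if goes_left else right).append(label)
            left_label, right_label = majority(left), majority(right)
            errors = sum(l != left_label for l in left)
            errors += sum(l != right_label for l in right)
            # Keep the first candidate with the fewest training errors.
            if best is None or errors < best[0]:
                best = (errors, feature, t, missing_left, left_label, right_label)
    return best[1:]


def predict(stump, row):
    """Classify one row, routing a missing value to the learned branch."""
    feature, threshold, missing_left, left_label, right_label = stump
    v = row[feature]
    goes_left = missing_left if math.isnan(v) else v <= threshold
    return left_label if goes_left else right_label


# Hypothetical data: one metric per startup, two startups with a blank.
X = [[100.0], [200.0], [math.nan], [900.0], [1000.0], [math.nan]]
y = [0, 0, 0, 1, 1, 0]
stump = fit_stump(X, y, feature=0)
```

On this toy data the stump splits at 200.0 and sends rows with a missing value to the left branch, so a blank is treated as informative rather than discarded.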

bearnxx commented 5 years ago

Got it. So the predictive model is essentially a classifier?

fabdrnd commented 5 years ago

It's the most straightforward use case. At PreSeries we're using classifiers to predict, for example, the likelihood of certain scenarios such as IPO, Acquisition, Closure, etc.
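As a rough illustration of how a classifier can output a likelihood per scenario, a decision tree typically reports the share of each class among the training examples that ended up in the same leaf. The sketch below assumes that approach; the function name `leaf_likelihoods` and the outcomes are made up, and it does not describe how PreSeries computes its scores.

```python
from collections import Counter


def leaf_likelihoods(outcomes):
    """Turn the outcomes observed in one tree leaf into class likelihoods."""
    counts = Counter(outcomes)
    total = sum(counts.values())
    return {scenario: n / total for scenario, n in counts.items()}


# Hypothetical outcomes of past startups that reached the same leaf.
leaf = ["Acquisition", "Acquisition", "IPO", "Closure"]
likelihoods = leaf_likelihoods(leaf)
```

A new startup routed to this leaf would receive these shares as its predicted likelihoods for each scenario.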

We run our models on the PreSeries Analyst Platform where you have a wide range of predictive models to choose from (Decision trees, Ensembles, Logistic Regression, Neural Nets, Time Series, Clustering, Anomaly detection, Association Discovery, Topic Modeling, etc.). In practice, we combine different modeling & evaluation approaches.

bearnxx commented 5 years ago

Got it. Thank you.