eugeneyan / eugeneyan-comments

https://eugeneyan.com/writing/what-does-a-data-scientist-really-do/ #23

Open utterances-bot opened 3 years ago

utterances-bot commented 3 years ago

What does a Data Scientist really do?

No, you don't need a PhD or 10+ years of experience.

https://eugeneyan.com/writing/what-does-a-data-scientist-really-do/

matrix21 commented 3 years ago

Absolutely spot on, Eugene. It would be really helpful if you could tell us more about these aspects: 1) building frameworks (e.g., validation) and pipelines, 2) running experiments, monitoring, and analysing, and 3) putting the data product into production.

How can one learn and apply these in a project? Is there a good end-to-end example that showcases these three steps?

PS: I really like your website and articles.

eugeneyan commented 3 years ago

Wow, that's a difficult question. Here's my humble attempt at planning such a project that covers those three aspects:

Problem statement

Building frameworks (e.g., validation) and pipelines

Running experiments, monitoring, and analysing

Putting the data product into production

Thank you for your kind words!

matrix21 commented 3 years ago

Thanks a lot, Eugene. This definitely gives me a direction and a relatable example of the steps.

I will try to incorporate this.

On Sat, Sep 26, 2020, 8:59 AM Eugene Yan notifications@github.com wrote:

Wow, that's a difficult question. Here's my humble attempt at planning such a project that covers those three aspects:

Problem statement

  • Given historical price and Twitter data, can we predict the next day's stock price?

Building frameworks (e.g., validation) and pipelines

  • Data acquisition pipeline (e.g., Yahoo Finance prices and tweets on specific tickers)
  • Monitor the frequency of tweets and Yahoo Finance data; notify if there is a long period without data
  • Validate the correctness of the data format (though admittedly, Yahoo Finance and Twitter data are pretty clean; perhaps check for emoticons or non-ASCII characters)
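
The monitoring and format checks in the list above could be sketched roughly as follows. This is only an illustration using the Python standard library: the function names (`find_stale_gaps`, `validate_tweet`), the record layout, and the 24-hour threshold are assumptions made for the example, not part of any Yahoo Finance or Twitter API.

```python
from datetime import datetime, timedelta


def find_stale_gaps(timestamps, max_gap=timedelta(hours=24)):
    """Return (start, end) pairs where consecutive records are more than max_gap apart."""
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > max_gap]


def validate_tweet(record):
    """Return a list of problems found in one tweet record (hypothetical dict layout)."""
    problems = []
    # Required fields are an assumption for this sketch.
    for field in ("ticker", "text", "created_at"):
        if not record.get(field):
            problems.append(f"missing {field}")
    text = record.get("text") or ""
    # str.isascii() is False as soon as the text contains an emoji or accented letter.
    if not text.isascii():
        problems.append("non-ASCII characters (e.g., emoji)")
    return problems
```

A scheduled job could call `find_stale_gaps` on the timestamps of recent records and send a notification when the result is non-empty. Price data has natural gaps over weekends and market holidays, so the threshold for prices would probably need to be longer than the one for tweets.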

Running experiments, monitoring, and analysing

  • Predict tweet sentiment to aggregate public sentiment on stock ticker
  • Predict next day's price based on historical price and trending tweet sentiment
  • Monitor model performance of next-day stock price prediction
  • Error analysis on largest errors
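
One way to picture the aggregation, monitoring, and error-analysis steps above is the small sketch below. It assumes per-tweet sentiment scores already exist (for example, from some classifier, scaled to the range -1 to 1) and that predictions are stored as `(date, predicted, actual)` tuples; both layouts and all function names are hypothetical choices for this example.

```python
from statistics import mean


def daily_sentiment(tweets):
    """Average per-tweet sentiment by day; tweets is an iterable of (date, score)."""
    by_day = {}
    for day, score in tweets:
        by_day.setdefault(day, []).append(score)
    return {day: mean(scores) for day, scores in by_day.items()}


def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and actual prices."""
    return mean(abs(p - a) for p, a in zip(predicted, actual))


def largest_errors(rows, k=3):
    """rows is a list of (date, predicted, actual); return the k worst predictions."""
    return sorted(rows, key=lambda r: abs(r[1] - r[2]), reverse=True)[:k]
```

Tracking `mean_absolute_error` over a rolling window gives a simple performance monitor, and the rows returned by `largest_errors` are the days worth inspecting by hand (for instance, reading that day's tweets for the ticker).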

Putting the data product into production

  • Online dashboard with daily update
  • Visualize tweet and predicted sentiment
  • Visualize historical price, predicted price, actual price
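
For the daily dashboard update, the data preparation might look something like the sketch below, which lines up actual and predicted prices by date so that any charting tool can plot them. The function name and the dict-based inputs are assumptions for illustration; dates are assumed to be ISO-formatted strings so that sorting them is chronological.

```python
def build_dashboard_rows(history, predictions):
    """Join actual and predicted closing prices by date.

    history: {date: actual_close}; predictions: {date: predicted_close}.
    Values that are not yet known (e.g., tomorrow's actual price) are None.
    """
    dates = sorted(set(history) | set(predictions))
    return [
        {"date": d, "actual": history.get(d), "predicted": predictions.get(d)}
        for d in dates
    ]
```

Rows where `actual` is still `None` correspond to predictions that cannot be scored yet, which keeps the pending next-day prediction visible on the chart.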

Thank you for your kind words!


chineidu commented 2 years ago

Excellent article!