suredream / hack-your-ds-interview

Review representative notebooks and ML projects through community learning, as hands-on practice for DS interviews
MIT License
6 stars 4 forks

session 003 - LSTM IMDB Sentiment Example Review from Colab #21

Open John-Cai-ds opened 2 years ago

John-Cai-ds commented 2 years ago

Why: A time series is a series of data points in time order, most commonly a sequence sampled at equally spaced points in time. Examples include ticket prices, hotel room prices, and daily stock prices. Time series are widely used in pattern recognition, weather forecasting, earthquake prediction, and any area of science and engineering that involves temporal measurements.
Date: 12/04/2021 TBD
Skype: https://join.skype.com/WzRQJuTDFrMe
Host: @John-Cai-ds
Facilitator: @stanghong
https://app.reviewnb.com/suredream/hack-your-ds-interview/blob/main/notebook%2FLSTM_IMDB_Sentiment_Example.ipynb

MeihZ commented 2 years ago

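Thanks a lot for sharing. I hope we can go deeper on the following next week:
1) Application areas of LSTM (what problems it mainly solves, what kind of data it suits)
2) Comparison with other models in practice
3) How to tune the model? Metrics?
4) RNN applications in general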

MeihZ commented 2 years ago

Please also give an overview of the current model choices for time series and how they compare.

stanghong commented 2 years ago

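Summary
LSTM performs better than traditional classification methods, but it demands more computing resources.
For text processing, TFIDF + RF/XGBOOST offers interpretability; with BERT and a large training size the embeddings are better, and DistilBERT may do better still.
NLP has promising applications in healthcare and insurance (see the video recording); LSTM has promising applications in autonomous driving and voice control.
LSTM handles discrete timesteps and the vanishing (gradient) problem well.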
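As a reminder of why LSTM copes with the vanishing problem mentioned in the summary, here is the standard textbook form of the LSTM cell. This is a sketch for discussion, not necessarily the exact variant used in the notebook; the symbols $W$, $U$, $b$ (weights and biases), $x_t$ (input), $h_t$ (hidden state), and $c_t$ (cell state) are notation assumed here, not taken from the session.

```latex
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{(forget gate)} \\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{(input gate)} \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{(output gate)} \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) && \text{(candidate state)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```

Along the direct cell-state path, $\partial c_t / \partial c_{t-1} = \operatorname{diag}(f_t)$. When the forget gate stays close to 1, the gradient passes through many timesteps largely undiminished, whereas a plain RNN multiplies by the same recurrent weight matrix and an activation derivative at every step.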

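Questions
1) Application areas of LSTM (what problems it mainly solves, what kind of data it suits)
2) Comparison with other models in practice
3) How to tune the model? Metrics?
4) RNN applications in general
5) NLP model drift: for example, how do we handle new words that appear on Twitter but do not exist in the training data?
6) The notebook uses a 50/50 split, which differs from other algorithms. Is there any guidance on how to split?
7) Is there anything to watch out for when handling stop words?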
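For questions 5 and 7, one common starting point is to reserve an out-of-vocabulary (OOV) id so that words unseen at training time still encode to something, and to make stop-word removal an explicit, optional step. The following is a minimal sketch under stated assumptions, not the notebook's preprocessing: the names `build_vocab` and `encode`, the reserved ids `PAD = 0` and `OOV = 1`, and the whitespace tokenizer are all hypothetical choices made for illustration.

```python
from collections import Counter

# Hypothetical reserved ids chosen for this sketch (not the notebook's values).
PAD, OOV = 0, 1


def build_vocab(texts, max_words=10000, stop_words=frozenset()):
    """Map the most frequent training tokens to integer ids starting at 2."""
    counts = Counter(
        tok
        for text in texts
        for tok in text.lower().split()  # naive whitespace tokenizer
        if tok not in stop_words
    )
    return {tok: i + 2 for i, (tok, _) in enumerate(counts.most_common(max_words))}


def encode(text, vocab, max_len=8, stop_words=frozenset()):
    """Encode one text: unseen tokens become OOV, short texts are right-padded."""
    ids = [
        vocab.get(tok, OOV)
        for tok in text.lower().split()
        if tok not in stop_words
    ]
    ids = ids[:max_len]  # truncate long inputs
    return ids + [PAD] * (max_len - len(ids))


train = ["the movie was great", "the movie was terrible"]
vocab = build_vocab(train)

# "yeet" never appears in the training texts, so it maps to OOV.
encoded = encode("the movie was yeet", vocab, max_len=6)
print(encoded)
```

An OOV id only keeps the model from breaking on new words. It does not teach the model what they mean, so drift still has to be monitored (for example, by tracking the share of OOV tokens over time) and addressed by retraining or by subword tokenization. On stop words, note that negations such as "not" appear in some common stop-word lists, and dropping them can flip the meaning of a sentiment example, so the list is worth reviewing before use.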