numfocus / YouTubeVideoTimestamps

Adding timestamps to NumFOCUS and PyData YouTube videos!
https://www.youtube.com/c/PyDataTV
MIT License

Itamar Turner-Trauring - Speed up Python data processing with vectorization | PyData Global 2022 #162

Open emmcauley opened 1 year ago

URL: https://www.youtube.com/watch?v=OUFe8JqPqTU&list=WL&index=1

Contents

0:06 Presentation introduction
0:18 Python performance paradox: a popular but slow language
1:22 Vectorization is a solution
2:03 Vectorization can be 100x faster (example)
4:38 Remaining presentation agenda
5:05 Why Python is slow: 3 reasons
5:12 Conversion into PyLong objects
7:41 Conversion between PyLong and Int objects
8:41 Greater overhead for seeking
10:52 Summary of why Python is slow
11:24 Vectorization supports fast operations
11:42 NumPy arrays are a better memory layout for numeric data
13:06 Why operations are faster on NumPy arrays (example)
14:36 Using built-in Python methods is still slow, even when input is a NumPy array
16:32 Vectorized APIs: bulk and fast
18:19 How to write fast code: 3 ways
18:29 Avoid built-in Python objects
19:44 Avoid iterating in Python
20:26 Avoid calling back into Python
22:41 Summary of how to write fast code in Python
23:17 Thank you!
24:42 Q&A: How to balance use of SQL and Python in ETL pipelines?
27:06 What are suggested alternatives to list comprehensions?
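
For readers skimming the outline, here is a minimal sketch of the contrast the talk's chapter titles describe: a Python-level loop over a NumPy array versus the same computation done with bulk NumPy operations. It is not taken from the talk. The function names and the sample data are hypothetical, chosen only for illustration, and the sketch makes no claim about the size of the speedup.

```python
import numpy as np


def mean_distance_loop(values, center):
    """Python-level loop: every element is handled one at a time by the interpreter."""
    total = 0.0
    for v in values:
        total += abs(v - center)
    return total / len(values)


def mean_distance_vectorized(values, center):
    """Bulk version: subtraction, absolute value, and mean each run inside NumPy."""
    return float(np.abs(values - center).mean())


# Hypothetical sample data: 100,000 consecutive floats.
values = np.arange(100_000, dtype=np.float64)
slow = mean_distance_loop(values, 50_000.0)
fast = mean_distance_vectorized(values, 50_000.0)
```

Both functions return the same value. The second follows the three guidelines listed under "How to write fast code": it keeps the data in a NumPy array, it does not iterate in Python, and it does not call back into Python per element.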