peacfuljoh/predictive-analytics-ytvideos
Full-stack real-time predictive analytics for YouTube content creators
0 stars, 0 forks
issues
Address data/model considerations
#37
peacfuljoh
closed
7 months ago
1
Add explanation of system design, how ETL pipeline configs work, MLOps functionality, etc.
#36
peacfuljoh
opened
7 months ago
0
Clean up GitHub Actions failures across ytpa project repos
#35
peacfuljoh
closed
7 months ago
1
Verify that code uses DataFrame index properly everywhere with .iterrows()
#34
peacfuljoh
opened
7 months ago
0
Regularly auto-restart crawler
#33
peacfuljoh
opened
8 months ago
0
Set up Sphinx docs?
#32
peacfuljoh
closed
7 months ago
0
Port ETL pipelines to PySpark (maps onto AWS Glue)?
#31
peacfuljoh
closed
9 months ago
1
Fix memory leak issue in websocket streams (use low-bandwidth test setup to debug)
#30
peacfuljoh
closed
9 months ago
1
Move get_validated_etl_request() and other db credential-dependent methods to API (e.g. routes_utils())
#29
peacfuljoh
closed
9 months ago
1
Decouple ETL pipelines from databases via API endpoints (post to pull, post to push)
#28
peacfuljoh
closed
9 months ago
0
Fix bulk load + most recent timestamp slow-down in etl_load_vocab_from_db()
#27
peacfuljoh
closed
10 months ago
0
Set up db backups
#26
peacfuljoh
closed
10 months ago
0
Split off utils to own package
#25
peacfuljoh
closed
10 months ago
0
Fix up macros (e.g. no hard-coding col names) and move ETL configs to JSON files
#24
peacfuljoh
closed
10 months ago
1
Gracefully handle crawler failure (loss of internet connection)
#23
peacfuljoh
closed
10 months ago
0
Add schema validation in MongoEngine (specify in engine config via schema file, validate inside all write ops)
#22
peacfuljoh
closed
10 months ago
0
Add tests for simple LR model components
#21
peacfuljoh
closed
10 months ago
1
Ensure macros for column names are fully utilized throughout repo
#20
peacfuljoh
closed
10 months ago
0
Implement performant regression model
#19
peacfuljoh
opened
11 months ago
1
Replace final return value of all generators with empty return statement
#18
peacfuljoh
closed
10 months ago
1
Implement unique sets search in MongoDB in one query instead of iterating
#17
peacfuljoh
opened
11 months ago
0
When loading vocab, add filter to choose which version (latest?)
#16
peacfuljoh
closed
11 months ago
2
Ensure all _id entries in MongoDB collections are ObjectId strings
#15
peacfuljoh
closed
11 months ago
1
Modularize repo to mimic AWS architecture
#14
peacfuljoh
closed
9 months ago
0
Implement testing and logging
#13
peacfuljoh
closed
9 months ago
1
Replace full-dataframe queries with generators in featurization pipeline
#12
peacfuljoh
closed
11 months ago
1
Incorporate prediction timeframe into which URLs are chosen in stats spider
#11
peacfuljoh
closed
11 months ago
1
Perform resampling for all numerical time series
#10
peacfuljoh
closed
11 months ago
0
Implement logging for crawler components
#9
peacfuljoh
closed
11 months ago
1
Implement ETL pipeline for featurizing raw data
#8
peacfuljoh
closed
11 months ago
0
Update thumbnail URL to be long enough
#7
peacfuljoh
closed
1 year ago
1
Resolve bugs in stats crawler
#6
peacfuljoh
closed
12 months ago
1
Story 3: Migrate MySQL and MongoDB engines to their own PyPI packages
#5
peacfuljoh
closed
10 months ago
0
Miscellaneous next steps
#4
peacfuljoh
opened
1 year ago
1
Story 2: API
#3
peacfuljoh
closed
9 months ago
1
Move db schema to schema.sql file
#2
peacfuljoh
closed
1 year ago
0
Story 1: Crawler
#1
peacfuljoh
closed
11 months ago
0