Closed banshee56 closed 10 months ago
We discussed possibly building an AI system that curates and edits educational videos on a given topic from other platforms, such as YouTube or TikTok, into a single video or video series on our app. The ideal system would have to identify sections of interest in these videos, join pieces of content so that the final product is comprehensible, and transform the material enough for the generated content to fall within fair-use law. An alternative that worked well in this rather well-cited study was crowdsourcing the editing of educational video content.
Insights for Learning at Scale | [background]
This paper discusses the effects of aggressive behavior in TikTok comment sections of educational content. This may provide valuable insights on how to monitor the community so that our users are able to maximize their learning gains on our platform. We can thus learn from this and perhaps foster a more nurturing community than one of our biggest would-be competitors, TikTok.
I was not able to get access to this paper, but I think it might prove valuable as a comprehensive study of the educational possibilities of platforms like TikTok (and ours). The paper is also very recent, so it likely points toward critical, up-to-date sources that may be useful for our purposes.
In our design doc, we briefly talk about prefetching logic for our videos to make sure the content is delivered quickly. This paper discusses a method to do this utilizing deep learning techniques in unstable network conditions.
Another method that aims to solve the above issue. This research includes an open source discrete-event simulator which may be useful when we decide on how to proceed with prefetching.
Describes how AI-powered microlearning – breaking down information into smaller pieces – has been used for training contact center agents at all levels, from new hires to long-term employees. Gives us qualitative goals to meet - ideally our product should be flexible, cost-effective, targeted, and engaging/interactive. Might help us think about applications and narrow the scope of our product if necessary.
Provides examples of existing short-form video platforms with an educational focus, including EJ4, Google Primer, The Training Arcade, and Axonify. Some strategies they use to make their content engaging include animations, infographics, and chatbots.
Affirms the problem of decreasing attention spans. Research group at UC Irvine found that people averaged 150 seconds on one screen before switching in 2004, but by 2021, this declined to 47 seconds. While digital distractions are mostly to blame, this essay suggests that completely unplugging from technology is impractical, so we should focus on regulating screen time instead. Maybe this is something we might think about when developing our own product - how to engage users without taking away from their productivity.
Aspects of TikTok that distinguish the platform from its competitors. We might consider incorporating some of these in our own project:
How YouTube generates video recommendations with a neural network. Discusses feature engineering and explains how layers are organized within the network. Could help us design our ML model.
[Link](https://www.netguru.com/blog/mvc-vs-mvvm-on-ios-differences-with-examples)
In addition to Apple's own documentation, this blog post might help us decide how we organize our app. It summarizes the differences between two possible architectures, MVC and MVVM (most commonly used for mobile apps). This article provides an additional example of how an MVVM application might be organized in practice, while this one provides some generic advice on how folders should be structured. Finally, this explains app management.
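To make the MVC vs. MVVM distinction concrete, here is a minimal, language-agnostic MVVM sketch. It's written in Python for illustration only (an iOS app would express the same roles in Swift); the `VideoModel`/`VideoViewModel` names and the formatting logic are hypothetical, not from the linked articles. The key point is that the ViewModel mediates between Model and View, so the View holds no business logic.

```python
class VideoModel:
    """Model: raw data, no presentation concerns."""
    def __init__(self, title, duration_seconds):
        self.title = title
        self.duration_seconds = duration_seconds

class VideoViewModel:
    """ViewModel: turns model data into display-ready values and
    notifies observers (views) when state changes."""
    def __init__(self, model):
        self._model = model
        self._observers = []

    def bind(self, callback):
        # A real iOS app would use Combine/KVO bindings; a plain
        # callback list stands in here.
        self._observers.append(callback)

    @property
    def display_title(self):
        m, s = divmod(self._model.duration_seconds, 60)
        return f"{self._model.title} ({m}:{s:02d})"

    def rename(self, title):
        self._model.title = title
        for cb in self._observers:
            cb(self.display_title)

rendered = []
vm = VideoViewModel(VideoModel("Intro to Python", 95))
vm.bind(rendered.append)   # the "View" simply re-renders on change
vm.rename("Python Basics")
print(rendered)  # ['Python Basics (1:35)']
```

The View never touches `VideoModel` directly, which is the testability win MVVM is usually chosen for.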
Netflix System Design
A white-boarded design of Netflix aimed at scalability and high availability. Includes a basic design for several services and a sensible split that maps cleanly onto microservices at deployment.
YouTube System Design
A white-boarded design of YouTube that we can reference at a high level for video streaming, transcoding, and storage methods. This article adds an extra piece: it also explains the messaging queues that may be necessary to asynchronously connect our services (video scraping, ML, and storage).
Video on Demand on AWS
A few proposed fully AWS-based microservice solutions for our streaming service, built with CloudFormation. Adopting a fully AWS solution will cost more than building our own and deploying it on an EC2 instance. However, it still offers interesting insight into the advantages of horizontal scaling and general high-level system design.
https://spotify.design/article/performance-cards-designing-with-empathy-and-meaning
I think this is an important dimension to keep in mind as we do our design work — that some of our target users may have atypical internet setups and/or devices, usage abilities, and accessibility needs. We do not have to accommodate every potential user, but we do not want to be a terrible UX for them either.
This can also be done down the line, in term 5 of the project (if we get there). It's not a must-have; it's a nice-to-have.
https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45530.pdf
The YouTube recommendation engine works in two stages. First, the feed is generated by gathering a subset of the millions of videos on YouTube: roughly a thousand candidates loosely similar to the user's preferences. This subset is generated loosely rather than calculated precisely, which optimizes for speed. With this smaller subset, the engine then ranks the videos by how closely they match the user's preferences.
To generate these candidates, many features are engineered and trained on. One of these is video upload time: since YouTube prioritizes freshness to keep content current, videos uploaded between roughly 1 hour and 2 days ago get a higher weight on the upload-time feature. This surfaces more new content and avoids recommendation cycles.
Other signals that feed the recommendation engine include whether videos are related to each other (e.g., episodes in a series), whether they come up through similar searches, and how long the user watched similar videos.
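The candidate-generation-then-ranking split described above can be sketched as follows. This is a toy illustration, not YouTube's actual code: the embeddings are random stand-ins for learned vectors, exact dot products replace the approximate nearest-neighbor search a production system would use, and the 0.5 freshness weight is an invented parameter.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical catalog: 10,000 videos with 16-dim learned embeddings.
video_embeddings = rng.normal(size=(10_000, 16))
user_embedding = rng.normal(size=16)
ages = rng.integers(0, 200, size=10_000)  # hours since upload

def generate_candidates(user_vec, video_vecs, k=1000):
    """Stage 1: cheap, loose retrieval of ~1,000 candidates.
    Exact dot products here; real systems use approximate search."""
    scores = video_vecs @ user_vec
    return np.argpartition(scores, -k)[-k:]

def rank_candidates(user_vec, video_vecs, candidate_ids,
                    freshness_hours, n=10):
    """Stage 2: precise ranking of the small set, with a boost for
    videos uploaded roughly 1 hour to 2 days ago."""
    scores = video_vecs[candidate_ids] @ user_vec
    fresh = ((freshness_hours[candidate_ids] >= 1)
             & (freshness_hours[candidate_ids] <= 48))
    scores = scores + 0.5 * fresh  # hypothetical freshness weight
    order = np.argsort(scores)[::-1][:n]
    return candidate_ids[order]

candidates = generate_candidates(user_embedding, video_embeddings)
feed = rank_candidates(user_embedding, video_embeddings, candidates, ages)
print(len(candidates), len(feed))  # 1000 10
```

The point of the two stages is cost: the expensive scoring only ever sees the thousand-video subset, never the full catalog.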
https://arxiv.org/pdf/2209.07663.pdf
Monolith is TikTok's recommendation engine, and it uses a "Collisionless Embedding Table" for its machine learning. It separates features into sparse and dense features. To store sparse features efficiently, Monolith keeps them in a hash table: the sparseness of the features keeps collisions low while still allowing O(1) lookups.
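The hash-table idea can be sketched as below. This is a minimal illustration in the spirit of the paper, not Monolith's implementation: the class name, lazy initialization, and dimensions are assumptions. The contrast is with a fixed-size modulo-hashed table, where two distinct feature IDs can collide into one embedding slot.

```python
import numpy as np

class SparseEmbeddingTable:
    """Hash-table-backed embedding store for sparse features.
    Each raw feature ID maps directly to its own vector, so lookups
    are average O(1) and distinct IDs never share a slot."""

    def __init__(self, dim=8, seed=0):
        self.dim = dim
        self.rng = np.random.default_rng(seed)
        self.table = {}  # feature_id -> embedding vector

    def lookup(self, feature_id):
        # Lazily create an embedding the first time an ID is seen;
        # IDs that never occur consume no memory.
        if feature_id not in self.table:
            self.table[feature_id] = self.rng.normal(scale=0.01,
                                                     size=self.dim)
        return self.table[feature_id]

table = SparseEmbeddingTable()
v1 = table.lookup("user:42")
v2 = table.lookup("user:42")
print(np.array_equal(v1, v2))  # True: same ID, same slot, no collision
```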
Furthermore, Monolith combines batch training in the back end with online training. The bulk of the training happens in the back end while part of it happens online, so the model is updated as new data comes in. This avoids the problem of models like ChatGPT that are trained solely on existing data rather than on a continuous stream of incoming data.
Meta's Instagram uses a funnel structure for feed retrieval, similar to YouTube's: at each stage, the ranking algorithm becomes more complex, narrowing the funnel to find the best videos to recommend. One specific component is the Two Tower Model, in which user data and item data (videos/images) are fed into two separate models, their similarities are computed, and the final result is generated. This way, when a user is scrolling through particular videos, the recommendations stay close to what they have been interested in and to the current video.
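A minimal sketch of the two-tower idea, under toy assumptions: each "tower" is a single linear layer with ReLU standing in for a deep network, the weights and features are random, and similarity is a plain dot product in the shared embedding space.

```python
import numpy as np

rng = np.random.default_rng(1)

def user_tower(user_features, W):
    """Hypothetical user tower: one linear layer + ReLU."""
    return np.maximum(W @ user_features, 0)

def item_tower(item_features, W):
    """Hypothetical item tower: same shape, separate weights."""
    return np.maximum(W @ item_features, 0)

# 8 raw features mapped into a shared 4-dim embedding space.
W_user = rng.normal(size=(4, 8))
W_item = rng.normal(size=(4, 8))
user = rng.normal(size=8)
items = rng.normal(size=(100, 8))

u = user_tower(user, W_user)
scores = np.array([item_tower(it, W_item) @ u for it in items])
top5 = np.argsort(scores)[::-1][:5]  # highest user-item similarity first
print(top5.shape)  # (5,)
```

The practical advantage is that item embeddings can be precomputed and indexed, so serving only needs one user-tower pass plus a nearest-neighbor lookup.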
https://dl.acm.org/doi/pdf/10.1145/3386392.3397594
One suggested way to further improve current systems, which rely heavily on user features, is to recommend through images. For example, if two influencers often post food images on their pages, the engine can use those photos to categorize the influencers and compute similarity scores. This enables recommendations that go beyond what a person and their friends are viewing, and it leverages more up-to-date image-processing technology for even more accurate recommendations.
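A sketch of the image-similarity idea: average the image embeddings on each influencer's page into one profile vector, then compare profiles with cosine similarity. The embeddings here are random placeholders; in practice they would come from a pretrained image model, and the aggregation choice (mean pooling) is an assumption, not the paper's exact method.

```python
import numpy as np

rng = np.random.default_rng(2)

def profile_embedding(image_embeddings):
    """Mean-pool the image embeddings from an influencer's page
    into a single profile vector."""
    return image_embeddings.mean(axis=0)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Stand-ins for CNN embeddings of 20 and 35 posted images (32-dim).
food_blogger_a = profile_embedding(rng.normal(size=(20, 32)))
food_blogger_b = profile_embedding(rng.normal(size=(35, 32)))

similarity = cosine(food_blogger_a, food_blogger_b)
print(-1.0 <= similarity <= 1.0)  # True
```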
https://www.inf.unibz.it/~ricci/ISR/papers/p293-davidson.pdf
This is an older paper by YouTube on its recommendation system. Before modern AI and ML, newly uploaded YouTube videos lacked much metadata because users didn't fill it in, which made them hard to recommend for lack of features. As a result, YouTube paired videos: if a user watched two videos within 24 hours, that pairing was stored in a system. Videos that users found and watched could then be recommended to other users, creating recommendations in a world absent of strong metadata.
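The co-watch pairing above amounts to counting which videos appear together in the same 24-hour watch window, then recommending the most frequent partners. A small sketch with made-up session data (the video IDs and session grouping are hypothetical):

```python
from collections import Counter
from itertools import combinations

# Videos one user watched within the same 24-hour window.
sessions = [
    ["vid_a", "vid_b", "vid_c"],
    ["vid_a", "vid_b"],
    ["vid_b", "vid_c"],
]

# Count co-watch pairs; order within a pair doesn't matter.
pair_counts = Counter()
for watched in sessions:
    for a, b in combinations(sorted(set(watched)), 2):
        pair_counts[(a, b)] += 1

def related(video, counts, n=2):
    """Recommend the videos most often co-watched with `video`,
    no metadata required."""
    scored = [(pair[1] if pair[0] == video else pair[0], c)
              for pair, c in counts.items() if video in pair]
    return [v for v, _ in sorted(scored, key=lambda x: -x[1])[:n]]

print(related("vid_a", pair_counts))  # ['vid_b', 'vid_c']
```

Because the signal is pure behavior, a freshly uploaded video with empty metadata becomes recommendable as soon as anyone co-watches it with something else.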