dartmouth-cs98-23f / project-short-learning

project-short-learning created by GitHub Classroom

State Of the Art Research #5

Closed banshee56 closed 8 months ago

banshee56 commented 11 months ago

State Of The Art Research

Your Problem Topic

banshee56 commented 11 months ago

State of the Art Research

Crowdsourcing step-by-step information extraction to enhance existing how-to videos | [method]

Paper Link

We discussed possibly building an AI system that can curate and edit educational videos on a particular topic from other websites like YouTube or TikTok to create a single video or video series on our app. The ideal AI system would have to identify sections of interest in these videos, join pieces of content so that the final product is comprehensible, and change the video enough for the generated content to fall within fair use. An alternative that seemed to work in this rather well-cited study was to use crowdsourcing for editing educational video content.

banshee56 commented 11 months ago

How Civil are Comments on TikTok’s Educational Videos?

Insights for Learning at Scale | [background]

Paper Link

This paper discusses the effects of aggressive behavior in TikTok comment sections of educational content. This may provide valuable insights on how to monitor the community so that our users are able to maximize their learning gains on our platform. We can thus learn from this and perhaps foster a more nurturing community than one of our biggest would-be competitors, TikTok.

banshee56 commented 11 months ago

Exploring TikTok as an Educational Tool for Speech-Language Pathologists, Special Education, and General Education | [background]

Paper Link

I was not able to get access to this paper but I think it might prove valuable in terms of having a comprehensive study on the educational possibilities of platforms like TikTok (and ours). This paper is also very recent, so it likely points towards critical, up-to-date sources that may end up being useful for our purposes.

banshee56 commented 11 months ago

Deep Learning-based Short Video Recommendation and Prefetching for Mobile Commuting Users | [method]

Paper Link

In our design doc, we briefly talk about prefetching logic for our videos to make sure the content is delivered quickly. This paper discusses a method to do this utilizing deep learning techniques in unstable network conditions.

banshee56 commented 11 months ago

Bandwidth-Efficient Multi-video Prefetching for Short Video Streaming | [method]

Paper Link

Another method that aims to solve the prefetching problem above. This research includes an open-source discrete-event simulator, which may be useful when we decide how to proceed with prefetching.
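
To make the prefetching question concrete, here is a minimal sketch of a budget-limited prefetch plan. It is not the method from either paper: the function name `plan_prefetch`, the idea of prefetching only each video's first chunk, and the byte budget are all assumptions made for illustration.

```python
def plan_prefetch(upcoming, budget_bytes):
    """Pick which upcoming videos to prefetch without exceeding a byte budget.

    upcoming: list of (video_id, first_chunk_bytes) in feed order (hypothetical shape).
    Walks the feed in order and stops at the first video that would not fit.
    """
    plan = []
    used = 0
    for video_id, size in upcoming:
        if used + size > budget_bytes:
            break  # keep feed order: do not skip ahead to smaller videos
        plan.append(video_id)
        used += size
    return plan


feed = [("a", 300), ("b", 500), ("c", 400)]
print(plan_prefetch(feed, budget_bytes=1000))
```

A real system would likely adapt the budget to measured network conditions, which is the part the papers above address.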

jessieli24 commented 11 months ago

Developing Top Performing Contact-Center Agents Through Microlearning | [method, market]

Link

Describes how AI-powered microlearning – breaking down information into smaller pieces – has been used for training contact center agents at all levels, from new hires to long-term employees. Gives us qualitative goals to meet - ideally our product should be flexible, cost-effective, targeted, and engaging/interactive. Might help us think about applications and narrow the scope of our product if necessary.

jessieli24 commented 11 months ago

Microlearning: The Future of Professional Development [competitive, market]

Link

Provides examples of existing short-form video platforms with an educational focus, including EJ4, Google Primer, The Training Arcade, and Axonify. Some strategies they use to make their content engaging include animations, infographics, and chatbots.

jessieli24 commented 11 months ago

How to Restore Our Dwindling Attention Spans [background]

Link

Affirms the problem of decreasing attention spans. Research group at UC Irvine found that people averaged 150 seconds on one screen before switching in 2004, but by 2021, this declined to 47 seconds. While digital distractions are mostly to blame, this essay suggests that completely unplugging from technology is impractical, so we should focus on regulating screen time instead. Maybe this is something we might think about when developing our own product - how to engage users without taking away from their productivity.

jessieli24 commented 11 months ago

How TikTok’s algorithm made it a success: ‘It pushes boundaries’ | [competitive]

Link

Aspects of TikTok that distinguish the platform from its competitors. We might consider incorporating some of these in our own project:

jessieli24 commented 11 months ago

Deep Neural Networks for YouTube Recommendations | [method]

Link

How YouTube generates video recommendations with a neural network. Discusses feature engineering and explains how layers are organized within the network. Could help us design our ML model.

jessieli24 commented 11 months ago

MVC vs. MVVM on iOS: Key Differences With Swift Examples | [method]

[Link](https://www.netguru.com/blog/mvc-vs-mvvm-on-ios-differences-with-examples)

In addition to Apple's own documentation, this blog post might help us decide how we organize our app. It summarizes the differences between two possible architectures, MVC and MVVM (most commonly used for mobile apps). This article provides an additional example of how an MVVM application might be organized in practice, while this one provides some generic advice on how folders should be structured. Finally, this explains app management.

linkevin281 commented 11 months ago

Netflix System Design

Link

A whiteboard-style system design of Netflix, aimed at scalability and high availability for users. Includes a basic design for some services and a sensible split that maps onto microservices at deployment.

linkevin281 commented 11 months ago

YouTube System Design

Link

A whiteboard-style system design of YouTube that we can use as a high-level reference for video streaming, transcoding, and storage methods. This article also explains message queues, which may be necessary to asynchronously connect our services: video scraping, ML, and storage.
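
As a rough illustration of how a queue decouples services, here is a minimal in-process sketch using Python's standard `queue` and `threading` modules. The stage names and the `"classified"` label are placeholders, and a deployed system would use a real message broker rather than an in-memory queue.

```python
import queue
import threading


def run_pipeline(video_urls):
    """Pass work from a 'scraper' stage to a 'classifier' stage through a queue."""
    scraped = queue.Queue()
    results = []

    def scraper():
        for url in video_urls:
            scraped.put(url)  # placeholder for real scraping work
        scraped.put(None)  # sentinel: tells the consumer there is no more work

    def classifier():
        while True:
            url = scraped.get()
            if url is None:
                break
            results.append((url, "classified"))  # placeholder for the ML step

    threads = [threading.Thread(target=scraper), threading.Thread(target=classifier)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


print(run_pipeline(["u1", "u2"]))
```

The producer never waits for the consumer to finish an item, which is the property we would want between scraping, ML, and storage.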

linkevin281 commented 11 months ago

Video on Demand on AWS

Link

A few proposed solutions, using CloudFormation, for building our streaming service entirely from AWS microservices. A fully AWS-based solution will likely cost more than building our own and deploying it on an EC2 instance. However, it still offers interesting insight into the advantages of horizontal scaling and into general high-level system design.

linkevin281 commented 11 months ago

Kaggle Datasets for Lists of Topics to be Sorted into Groups

Link1 Link2

siavava commented 11 months ago

Spotify Design Research on Designing with Empathy for Access-Constrained Users

https://spotify.design/article/performance-cards-designing-with-empathy-and-meaning

I think this is an important dimension to keep in mind as we do our design work: some of our target users may have atypical internet setups and/or devices, usage abilities, and accessibility needs. We do not have to accommodate every potential user, but we do not want to give them a terrible UX either.

This can also be done down the line! In term 5 of the project (if we get there). It's not a must-have, it's a nice-to-have.

zhenyiplusone commented 11 months ago

Deep Neural Networks for YouTube Recommendations [Competitive]

https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45530.pdf

The YouTube recommendation engine works in two stages. First, candidate generation narrows the millions of videos on YouTube down to a subset of a few hundred that loosely match the user's preferences. This subset is deliberately approximate rather than precisely calculated, so that speed is optimized. The ranking stage then orders this smaller subset by how closely each video matches the user's preferences.

To generate these candidates, the model is trained on many features, one of which is the video's upload time. Since YouTube prioritizes fresh content, there is a curve in which the upload-time feature carries more weight for videos uploaded between 1 hour and 2 days ago. This lets newer content surface and avoids recycling the same videos.

Other things that help create the recommendation engine are whether videos have relationships with each other (episodes in a series), come up through similar searches, and how long the user watched similar videos.
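
The two-stage funnel described above can be sketched roughly as follows. This is an illustration only, not YouTube's implementation: the `Video` record, the tag-overlap candidate filter, the 48-hour freshness bonus, and the weights are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Video:
    video_id: str
    tags: frozenset
    age_hours: float


def generate_candidates(videos, user_tags, limit):
    """Stage 1: cheap, coarse filter. Keep videos sharing any tag with the user."""
    matches = [v for v in videos if v.tags & user_tags]
    return matches[:limit]


def score(video, user_tags):
    """Stage 2: finer score. Hypothetical weights: tag overlap plus a freshness bonus."""
    overlap = len(video.tags & user_tags)
    freshness = 1.0 if video.age_hours <= 48 else 0.0
    return overlap + 0.5 * freshness


def recommend(videos, user_tags, candidate_limit=200, top_k=3):
    candidates = generate_candidates(videos, user_tags, candidate_limit)
    ranked = sorted(candidates, key=lambda v: score(v, user_tags), reverse=True)
    return [v.video_id for v in ranked[:top_k]]


catalog = [
    Video("a", frozenset({"swift", "ios"}), 5.0),
    Video("b", frozenset({"cooking"}), 1.0),
    Video("c", frozenset({"swift"}), 500.0),
    Video("d", frozenset({"ios", "swift", "mvvm"}), 200.0),
]
print(recommend(catalog, frozenset({"swift", "ios"})))
```

The point of the split is that the expensive scoring only runs on the small candidate set, never on the whole catalog.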

zhenyiplusone commented 11 months ago

TikTok - Monolith: Real Time Recommendation System With Collisionless Embedding Table [Competitive]

https://arxiv.org/pdf/2209.07663.pdf

Monolith is TikTok's recommendation engine, and it uses a "Collisionless Embedding Table" to store the embeddings its model learns. It separates its features into sparse features and dense features. Sparse features are stored in a hash table designed so that different feature IDs do not collide, while lookups still take O(1) time.

Furthermore, Monolith combines batch training with online training. The bulk of the training happens offline in the back end, while part of it happens online, so the model keeps updating as new data comes in. This avoids the problem seen in models like ChatGPT, which are trained solely on a fixed snapshot of existing data rather than on continuously incoming data.
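
A toy sketch of why collisions matter, with one update per incoming event standing in for online training. This is not Monolith's data structure: a plain Python dict plays the role of the collisionless table, the modulo table is a deliberately naive baseline, and all class names are hypothetical.

```python
class DictEmbeddingTable:
    """One row per feature ID, so distinct IDs never share a row."""

    def __init__(self, dim):
        self.dim = dim
        self.rows = {}

    def lookup(self, feature_id):
        # Create a zero row on first sight; dict access is O(1) on average.
        return self.rows.setdefault(feature_id, [0.0] * self.dim)

    def update(self, feature_id, gradient, lr=0.1):
        # One small gradient step per incoming event (online-style update).
        row = self.lookup(feature_id)
        for i, g in enumerate(gradient):
            row[i] -= lr * g


class ModuloEmbeddingTable(DictEmbeddingTable):
    """Fixed number of slots: IDs are folded with modulo, so they can collide."""

    def __init__(self, dim, slots):
        super().__init__(dim)
        self.slots = slots

    def lookup(self, feature_id):
        return super().lookup(feature_id % self.slots)


exact = DictEmbeddingTable(dim=2)
exact.update(7, [1.0, -1.0])
print(exact.lookup(1007))   # untouched: ID 1007 has its own row

folded = ModuloEmbeddingTable(dim=2, slots=1000)
folded.update(7, [1.0, -1.0])
print(folded.lookup(1007))  # shares a row with ID 7, so it sees ID 7's update
```

In the folded table, two unrelated IDs end up with the same embedding, which is the kind of quality loss a collisionless design is meant to avoid.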

zhenyiplusone commented 11 months ago

Scaling the Instagram Explore recommendations system [Competitive]

https://engineering.fb.com/2023/08/09/ml-applications/scaling-instagram-explore-recommendations-system/

Meta's Instagram uses a funnel structure for feed retrieval, similar to YouTube's. At each stage, the ranking algorithm becomes more complex, narrowing the pool of candidates to find the best items to recommend. One specific component is the Two Tower model: user data and item data (videos/images) are fed into two separate models, and the similarity between their outputs produces the final result. This way, a user scrolling through videos is recommended items similar both to what they have been interested in and to the video they are watching.
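
The two-tower idea can be sketched in a few lines. This is a simplified illustration, not Instagram's model: each tower here is a single hand-written linear layer with made-up weights, whereas a real system learns both towers from interaction data.

```python
def linear_tower(weights, features):
    """Map a feature vector to an embedding with one linear layer (no training here)."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# Hypothetical, hand-picked weights; both towers output 2-dimensional embeddings.
USER_WEIGHTS = [[1.0, 0.0], [0.0, 1.0]]
ITEM_WEIGHTS = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]


def rank_items(user_features, items):
    """Score each item by the dot product of user and item embeddings, best first."""
    user_vec = linear_tower(USER_WEIGHTS, user_features)
    scored = [
        (dot(user_vec, linear_tower(ITEM_WEIGHTS, feats)), item_id)
        for item_id, feats in items.items()
    ]
    return [item_id for _, item_id in sorted(scored, reverse=True)]


items = {
    "cooking_clip": [0.0, 1.0, 1.0],
    "coding_clip": [1.0, 0.0, 0.0],
    "mixed_clip": [0.5, 0.5, 0.0],
}
print(rank_items([0.9, 0.1], items))
```

Because the item tower does not depend on the user, item embeddings can be computed ahead of time and only the user embedding needs computing at request time.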

zhenyiplusone commented 11 months ago

Keeping up with the Influencers: Improving User Recommendation in Instagram using Visual Content [Method]

https://dl.acm.org/doi/pdf/10.1145/3386392.3397594

One suggested way to improve on current systems, which rely heavily on user features, is to use images. For example, if two influencers often post food images, the engine can use the photos to categorize the influencers and compute similarity scores. This allows recommendations beyond what a person and their friends are viewing, and it uses more up-to-date image processing technology to make recommendations more accurate.

zhenyiplusone commented 11 months ago

The YouTube Video Recommendation System [Competitive]

https://www.inf.unibz.it/~ricci/ISR/papers/p293-davidson.pdf

This is an older paper by YouTube on their recommendation system. Before today's AI and ML approaches, newly uploaded YouTube videos lacked a lot of metadata because users don't fill it in. This made the videos harder to recommend because they lacked features. As a result, YouTube created a way to pair videos: if the same user watched two videos within 24 hours, the pairing would be stored in the system. The videos that users find and watch can then be recommended to other users, producing recommendations even without strong metadata.
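
The pairing idea can be sketched as simple co-visitation counting. This is a simplification under stated assumptions: each inner list stands for the videos one user watched within a 24-hour window, the function names are hypothetical, and any normalization the real system applies is left out.

```python
from collections import Counter
from itertools import combinations


def co_visitation_counts(sessions):
    """Count how often each pair of videos was watched in the same session."""
    counts = Counter()
    for watched in sessions:
        # sorted(set(...)) gives each unordered pair one canonical key per session.
        for a, b in combinations(sorted(set(watched)), 2):
            counts[(a, b)] += 1
    return counts


def related_videos(counts, seed, top_k=2):
    """Return the videos most often co-watched with the seed video."""
    scores = Counter()
    for (a, b), n in counts.items():
        if a == seed:
            scores[b] += n
        elif b == seed:
            scores[a] += n
    return [video for video, _ in scores.most_common(top_k)]


sessions = [["v1", "v2", "v3"], ["v1", "v2"], ["v2", "v4"]]
counts = co_visitation_counts(sessions)
print(related_videos(counts, "v1"))
```

Nothing here needs titles, tags, or descriptions, which is why this approach works when metadata is missing.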