-
# Goal
Show TACC news articles [list] on LCCF. [Include pagination.] — [TUP-706](https://tacc-main.atlassian.net/browse/TUP-706).
# Why
- Show news that is directly relevant to 2 sites on b…
-
There currently appears to be no coverage of Australian news websites. I really lack the time to make a PR to add these, but I've created a list in case there is interest in adding them.
Most import…
-
This data will not be persisted; it is just an experiment to assess feasibility for future use.
-
```
youtube-dl --verbose --no-mtime --restrict-filenames --no-part --verbose --age-limit 50 --get-filename -f 'worst[height>=360][ext*=mp4]/worst/bestvideo[height>=360][ext=mp4]' -o '%(title)s_fmt_%(…
-
The website has been shown to be the largest source of fake news and propaganda against the opposition and Muslims in India. As such, it needs to be added to the list.
Source: https://en.wikipedia.org/wiki/OpInd…
-
This is a recurring task: each day we will aggregate news articles from the source websites listed in the document. We should try to get at least 6 articles per day.
https://docs.google.com/do…
-
Over here in the Media Cloud project we're seeing poor performance on the content extraction task for a variety of pages that include links to other "related" stories at the end of article content. Ou…
-
# Objective
Develop scripts to efficiently scrape Tibetan news articles from multiple sources, starting with the Voice of Tibet (VOT) website, and store them in a structured format for training a mach…
-
Hi, it seems that the authorities in Russia blocked the BBC and the German broadcaster DW a couple of hours ago.
We are witnessing a drop in traffic. I had a cron job running a simple curl of bbc.com usin…
-
### Description:
We have several websites containing Tibetan literature data that need to be scraped to gather as much valuable information as possible for training our LLM. The task involves not only…