Schroedinger-Hat / youtube-to-anchorfm

An automation process that converts a YouTube video into an audio file and uploads it to an Anchor.fm podcast
MIT License
127 stars, 70 forks

Publishing from a channel or playlist #4

Closed PovilasID closed 10 months ago

PovilasID commented 3 years ago

Hey,

This looks very useful and thanks for sharing.

Have you considered supporting publishing from a playlist or a channel, not just a single video? The workflow would be: when a video is published there, it is automatically redistributed to Anchor.

TheJoin95 commented 3 years ago

Hi, thank you.

Right now we are not planning to add this feature. Every contribution is really appreciated, so feel free to open a pull request :)

jeremyzilar commented 3 years ago

This would be the best! Especially from a playlist ⭐

TheJoin95 commented 3 years ago

@jeremyzilar @PovilasID we are working on this feature :)

ghost commented 2 years ago

https://m.youtube.com/playlist?list={ID}/videos?pbj=1 and https://m.youtube.com/channel/{ID}/videos?pbj=1 return JSON from which you can parse the list of videos.

After grabbing the list of videos, we can upload them the same way we do now, individually.

I guess we also need to create a local file, playlist_channel_ID_anchor_username, that records the uploaded video IDs so the same files are not uploaded twice.
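
A minimal sketch of that bookkeeping file, assuming it stores one video ID per line. The function names and the file layout are hypothetical and are not part of youtube-to-anchorfm.

```python
from pathlib import Path


def load_uploaded_ids(state_file: Path) -> set[str]:
    """Return the video IDs recorded in state_file (one ID per line)."""
    if not state_file.exists():
        return set()
    lines = state_file.read_text().splitlines()
    return {line.strip() for line in lines if line.strip()}


def mark_uploaded(state_file: Path, video_id: str) -> None:
    """Append video_id to state_file so later runs skip it."""
    with state_file.open("a") as handle:
        handle.write(video_id + "\n")


def pending_ids(playlist_ids: list[str], state_file: Path) -> list[str]:
    """Keep the playlist order, dropping IDs that were already uploaded."""
    done = load_uploaded_ids(state_file)
    return [video_id for video_id in playlist_ids if video_id not in done]
```

Calling mark_uploaded only after an upload succeeds means a failed upload is retried on the next run.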

weltonrodrigo commented 2 years ago

The way I see it, it's a matter of synchronizing Anchor's RSS feed with the YouTube playlist, matching video_id against a tag in the episode description.

A script to delete episodes from Anchor would be needed in case a video is deleted from the original playlist (this can be a feature).

Right now, you can process a full playlist (one way only) with

curl https://scc-youtube.vercel.app/playlist-items/PLoXdlLuaGN8ShASxcE2A4YuSto3AblDmX \
    | jq '.[].contentDetails.videoId' -r \
    | tac \
    | xargs -I% bash -c "jo id='%' > episode.json && git commit -am % && git push"

https://scc-youtube.vercel.app/playlist-items is from https://github.com/ThatGuySam/youtube-json-server
jo is a JSON generator: https://github.com/jpmens/jo
tac is a command present in most Linux distributions, and on macOS with brew install coreutils. It reverses the list so that it runs from older to newer. Remove it if you want to upload in the order presented on YouTube.
jq is a JSON processor: https://stedolan.github.io/jq/

This must be run in the folder where your episode.json is.
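
For readers who would rather not chain shell tools, here is a rough Python equivalent of the pipeline above. It assumes the video IDs have already been fetched and arrive newest first, and that git is on the PATH. The function names are hypothetical.

```python
import json
import subprocess
from pathlib import Path


def oldest_first(video_ids: list[str]) -> list[str]:
    """Mirror `tac`: reverse a newest-first list so the oldest video comes first."""
    return list(reversed(video_ids))


def write_episode(video_id: str, path: Path = Path("episode.json")) -> None:
    """Mirror `jo id=...`: write {"id": "<video_id>"} to episode.json."""
    path.write_text(json.dumps({"id": video_id}) + "\n")


def publish(video_ids: list[str], repo: Path = Path(".")) -> None:
    """Commit and push once per video, so each push triggers one workflow run."""
    for video_id in oldest_first(video_ids):
        write_episode(video_id, repo / "episode.json")
        subprocess.run(["git", "commit", "-am", video_id], cwd=repo, check=True)
        subprocess.run(["git", "push"], cwd=repo, check=True)
```

Because of check=True, this version stops at the first failing git command, whereas the xargs pipeline moves on to the next ID.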

abe-101 commented 2 years ago

@weltonrodrigo Just so I understand: this bash script pulls all the videos from the playlist, puts each individual video in the episode.json file, commits, and pushes to GitHub. On GitHub there will then be a queue of actions, one for each video. Does this sound correct?

weltonrodrigo commented 2 years ago

Yeah. Pretty much it.

I suggest you run it late at night (Pacific time). GitHub Actions runners seem much more responsive then.


abe-101 commented 2 years ago

One other question: this parses a YouTube playlist, but I'd like to parse a channel. Do you know how I would change the URL?

weltonrodrigo commented 2 years ago

You can either:

1) Parse a channel using the YouTube API. This is a more complex endeavour, as you'll need a Google credential, etc.
2) Use youtube-dl to dump the channel as JSON and extract the IDs:

youtube-dl https://www.youtube.com/user/ozzymanreviews --dump-json \
    | jq '.id' -r \
    | tac \
    | xargs -I% bash -c "jo id='%' > episode.json && git commit -am % && git push"

abe-101 commented 2 years ago

curl https://scc-youtube.vercel.app/playlist-items/PLoXdlLuaGN8ShASxcE2A4YuSto3AblDmX \
    | jq '.[].contentDetails.videoId' -r \
    | tac \
    | xargs -I% bash -c "jo id='%' > episode.json && git commit -am % && git push"

This little script is a brilliant hack. One small side effect: although the script reverses the videos so they run from earliest to most recent, the order isn't preserved perfectly, because GitHub Actions runs many jobs simultaneously and the shorter videos finish uploading first.

Implementing a feature like #32 would fix this.

weltonrodrigo commented 2 years ago

You can deal with out-of-order videos by setting the episode number on Anchor. You'll have to set the episode number on all episodes, though.

On May 9, 2022, at 9:13 PM, Abe wrote:

This little script is a brilliant hack. I just used it to upload over 70 videos! Thank you! One small side effect: although the script reverses the videos so they run from earliest to most recent, the order isn't preserved perfectly, because GitHub Actions runs many jobs simultaneously and the shorter videos finish uploading first.

abe-101 commented 2 years ago

The way I see it, its a matter of synchronizing anchor's RSS to the youtube playlist, matching video_id with a tag on the episode description.

Necessary to implement a script to delete from anchor, in case a video is deleted from the original playlist (can be a feature).

A thought on comparing/syncing YouTube to the podcast: if we add the YouTube link to the end of each episode's description, we can then use the ID within the link to match the episode to the YouTube video.
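
A minimal sketch of that round trip, assuming the link is written as https://www.youtube.com/watch?v=<id> on the last line of the description. Both helper names are hypothetical.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def with_video_link(description: str, video_id: str) -> str:
    """Append the YouTube link as the last line of the description."""
    return f"{description.rstrip()}\n\nhttps://www.youtube.com/watch?v={video_id}"


def video_id_from_description(description: str) -> Optional[str]:
    """Read the video ID back from a link on the last line, if there is one."""
    lines = description.strip().splitlines()
    if not lines:
        return None
    parsed = urlparse(lines[-1].strip())
    if parsed.netloc.endswith("youtube.com") and parsed.path == "/watch":
        values = parse_qs(parsed.query).get("v")
        return values[0] if values else None
    return None
```

Matching on the ID rather than the title keeps the comparison stable when a video is renamed.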

abe-101 commented 2 years ago

Building upon @weltonrodrigo's idea, I've created a Python script which will trigger the action for a whole channel. I've wrapped it up in a Docker container, ready to be used in a GitHub Action:

name: Upload Full YouTube Channel to Podcast

# Controls when the workflow will run
on:
  # Allows you to run this workflow manually from the Actions tab
  workflow_dispatch:
    inputs:
      environment:
        description: 'YouTube Channel URL'
        type: string
        required: true

# A workflow run is made up of one or more jobs that can run sequentially or in parallel
jobs:
  # This workflow contains a single job called "build"
  build:
    # The type of runner that the job will run on
    runs-on: ubuntu-latest

    # Steps represent a sequence of tasks that will be executed as part of the job
    steps:
      # Checks-out your repository under $GITHUB_WORKSPACE, so your job can access it
      - uses: actions/checkout@v3
        with:
          token: ${{secrets.PERSONAL_ACCESS_TOKENS}}
      - name: Build YouTube Channel Podcast
        uses: abe-101/Build-YouTube-Podcast@main
        env:
          URL: ${{ github.event.inputs.environment }}

This avoids having to use the command line. The user manually triggers the action after pasting the URL.

abe-101 commented 2 years ago

The actual python script is here

abe-101 commented 2 years ago

      - name: Build YouTube Channel Podcast
        uses: abe-101/Build-YouTube-Podcast@main
        env:
          URL: ${{ github.event.inputs.environment }}

Apparently this might not be the best idea. It might violate GitHub's T&C and get the account suspended... I suspect the issue is one action triggering many others.

weltonrodrigo commented 1 year ago

I have been thinking about this for quite some time now.

What we need is a sync phase.

The sync phase will compare two lists: the podcast list vs the youtube list.

This comparison needs a key, something present in both lists for each episode. It can be the video URL (using URL_IN_DESCRIPTION: true) or the video title.

The result of this comparison is a set of episodes to append and a set of episodes to delete.

We can expose an APPEND_ONLY option. If deletions are desired, we need code to delete an Anchor episode based on its title.

The algorithm:

  1. Get the YouTube playlist and extract titles or URLs for comparison.
  2. Get the Anchor RSS feed and extract titles, or URLs from the descriptions, for comparison.
  3. Generate a deletion list and an append list based on the differences.
  4. Get from the RSS feed the title of every episode to be deleted.
  5. Delete episodes one by one with a function deleteEpisode(title).
  6. Append new episodes one by one with a function appendEpisode(video_id).
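
Step 3 above can be sketched as a plain list difference. This assumes both sides have already been reduced to comparable keys (video IDs, URLs or titles). SyncPlan and plan_sync are hypothetical names, and the deleteEpisode/appendEpisode functions from the list are not implemented here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyncPlan:
    to_append: list[str]  # keys on YouTube that are missing from the podcast
    to_delete: list[str]  # keys in the podcast that are gone from YouTube


def plan_sync(
    youtube_keys: list[str],
    podcast_keys: list[str],
    append_only: bool = True,
) -> SyncPlan:
    """Compare the two lists by key and return what to append and delete."""
    youtube = set(youtube_keys)
    podcast = set(podcast_keys)
    # Preserve playlist order so episodes are appended in sequence.
    to_append = [key for key in youtube_keys if key not in podcast]
    # With append_only, nothing is ever scheduled for deletion.
    to_delete = [] if append_only else [key for key in podcast_keys if key not in youtube]
    return SyncPlan(to_append, to_delete)
```

Defaulting append_only to True is a deliberate choice in this sketch: a mismatched key then cannot cause an episode to be deleted by accident.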

abe-101 commented 1 year ago

I've been using URL_IN_DESCRIPTION: true in my episodes, which lets me build (in Python) a set() of published IDs from the Anchor.fm RSS feed. I can then check whether an ID has already been published.
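
A rough sketch of how such a set could be built with the standard library. It assumes the feed is plain RSS 2.0 with one description element per item, and that video IDs are 11 characters drawn from letters, digits, "_" and "-". This is not the script referred to above, and published_ids is a hypothetical name.

```python
import re
import xml.etree.ElementTree as ET

# Assumed link shapes: youtube.com/watch?v=<id> and youtu.be/<id>.
YOUTUBE_ID = re.compile(r"(?:youtube\.com/watch\?v=|youtu\.be/)([A-Za-z0-9_-]{11})")


def published_ids(rss_xml: str) -> set[str]:
    """Collect the YouTube IDs linked from every item description in the feed."""
    root = ET.fromstring(rss_xml)
    ids: set[str] = set()
    for description in root.iterfind("./channel/item/description"):
        ids.update(YOUTUBE_ID.findall(description.text or ""))
    return ids
```

With the set in hand, skipping a video is a single membership test, for example `if video_id in published_ids(feed_text)`.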

abe-101 commented 1 year ago

I have been thinking about this for quite some time now.

What we need is a sync phase.

I avoid using GitHub Actions, as in my opinion it violates the TOS. Instead I run the tool locally with npm start.

Here's how my workflow goes: another fellow updates a Google Sheet with links to videos that need to be converted. I then run a Python script that pulls the IDs from the Google Sheet, marks them as triggered, and triggers the action to convert each video into a podcast episode. Lastly, the script checks whether the episode was published successfully and marks it so on the spreadsheet.

Here is the python script https://github.com/abe-101/googlesheet-api-anchorFM

TheJoin95 commented 10 months ago

If there are no updates I think we can close this cc @matevskial @abe-101

abe-101 commented 10 months ago

I vote we close this