gautamkrishnar / blog-post-workflow

Show your latest blog posts from any sources or StackOverflow activity or Youtube Videos on your GitHub profile/project readme automatically using the RSS feed
https://github.com/marketplace/actions/blog-post-workflow
GNU Affero General Public License v3.0
3.03k stars 269 forks

[Feature]: Remove newlines from RSS post titles #34

Closed nil0x42 closed 4 years ago

nil0x42 commented 4 years ago

Is your feature request related to a problem? Please describe. I use a Twitter RSS feed as input, and titles sometimes contain newlines, leading to improper markdown formatting.

Describe the solution you'd like Replace occurrences of \r\n and \n in the title with a single space.

Describe alternatives you've considered n/a

Additional context Here's how titles get formatted: [screenshots: Screenshot_2020-10-10_13-22-26, Screenshot_2020-10-10_13-21-04]

gautamkrishnar commented 4 years ago

You can now use the item_exec parameter to do advanced text manipulation via JavaScript if required. Each post item is available as the post variable: https://github.com/gautamkrishnar/blog-post-workflow/blob/0d45a71c69865f43ca406abbdce8df5b14f504f9/blog-post-workflow.js#L240-L254

You can do something like this:

name: Latest Tweets workflow
on:
  schedule: # Run workflow automatically
    - cron: '0 * * * *' 
jobs:
  update-readme-with-blog:
    name: Update this repo's README with latest blog posts
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: gautamkrishnar/blog-post-workflow@master
        with:
          feed_list: "https://rss.app/feeds/7YyygtUxDsIaLQ3s.xml"
          item_exec: |
                    post.title = post.title.replace('\n',' '); post.title = post.title.replace('\r\n',' ');
                    console.log("test\n world");

Another interesting use case example: https://github.com/ayushi7rawat/ayushi7rawat/blob/master/.github/workflows/youtube.yml

name: Latest youtube videos
on:
  schedule: # Run workflow automatically
    - cron: '5 * * * *' # Runs every hour, at minute 5
  workflow_dispatch: # Run workflow manually (without waiting for the cron to be called), through the GitHub Actions Workflow page directly
jobs:
  update-readme-with-youtube:
    name: Update this repo's README with latest blog posts
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: gautamkrishnar/blog-post-workflow@master
        with:
          feed_list: "https://www.youtube.com/feeds/videos.xml?channel_id=UCvmONGrUQxL3B3PmSv1JQqQ"
          item_exec: "post.title = post.title.split('|')[0]"
          comment_tag_name: "YOUTUBE"
          commit_message: "Updated with the latest youtube video"
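For readers unfamiliar with the expression used in this workflow: `split('|')[0]` keeps only the text before the first `|`. Here is a minimal sketch in plain JavaScript, using an invented sample title (real feed titles will differ):

```javascript
// The sample title is made up for illustration.
const title = 'My video title | Channel name';

// Everything before the first "|" (note the trailing space that is left over).
const short = title.split('|')[0];

console.log(JSON.stringify(short));        // "My video title "
console.log(JSON.stringify(short.trim())); // "My video title"
```

Appending `.trim()` inside `item_exec` would remove the leftover space, if that matters for the rendered list.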

Stripping out html contents in description example: https://github.com/gautamkrishnar/blog-post-workflow/issues/74#issuecomment-878660357

name: Latest blog post workflow
on:
  schedule:
    - cron: 0 * * * *
  push:
    branches:
      - main
jobs:
  update-readme-with-blog:
    name: Update this repo's README with latest blog posts
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: gautamkrishnar/blog-post-workflow@master
        with:
          max_post_count: '5'
          feed_list: 'https://medium.com/feed/@nicko170'
          template: '- [$title]($url): $description $newline'
          date_format: 'UTC:ddd yyyy-mm-dd h:MM:ss TT Z'
          filter_comments: medium
          tag_post_pre_newline: 'true'
          item_exec: |
            post.description = post.description.replace(/<\/?[^>]+(>|$)/g, ""); 
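The regular expression in that `item_exec` line can be tried outside the workflow. The sketch below wraps it in a hypothetical `stripHtml` helper (the name and the sample strings are assumptions, not part of the action):

```javascript
// Hypothetical helper wrapping the same regex as the item_exec line above.
function stripHtml(html) {
  // <\/?   an opening "<", optionally followed by "/"
  // [^>]+  the tag name and any attributes
  // (>|$)  the closing ">" or, for a cut-off tag, the end of the string
  return html.replace(/<\/?[^>]+(>|$)/g, '');
}

console.log(stripHtml('<p>Hello <strong>world</strong></p>'));
// Hello world
console.log(stripHtml('Read more <a href="https://example.com">here</a>'));
// Read more here
```

This is a rough filter rather than an HTML parser: entities such as `&amp;` are left as-is, and a `>` inside an attribute value would end the match early.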

gautamkrishnar commented 4 years ago

Released on https://github.com/gautamkrishnar/blog-post-workflow/releases/tag/1.3.1

nil0x42 commented 4 years ago

Wow! I'm simply shocked by your responsiveness!

nil0x42 commented 4 years ago

Just for the record: there is a problem with \n and \r literals in item_exec, so they must be escaped.

As an example, for my case, the correct item_exec is:

item_exec: "post.title = post.title.replace(/(?:\\r\\n|\\r|\\n)/g,' ');"

gautamkrishnar commented 4 years ago

Thanks for pointing it out @nil0x42
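The escaped pattern from the corrected `item_exec` above can be checked outside the workflow. In a YAML double-quoted string, `\\r` denotes a backslash followed by `r`, so the JavaScript that results is the regex shown below. The `normalizeTitle` helper and the sample titles are assumptions made for illustration:

```javascript
// Hypothetical helper mirroring the corrected item_exec expression.
function normalizeTitle(title) {
  // Match Windows (\r\n), bare \r, and Unix (\n) line breaks; the "g" flag
  // replaces every occurrence rather than only the first one.
  return title.replace(/(?:\r\n|\r|\n)/g, ' ');
}

console.log(normalizeTitle('first line\nsecond line')); // first line second line
console.log(normalizeTitle('a\r\nb\rc'));               // a b c
```

Note that `replace` with a string pattern, as in `replace('\n', ' ')`, replaces only the first match, which is another reason to prefer the global regex.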

nil0x42 commented 4 years ago

@gautamkrishnar, is it possible to use this new feature to ignore an entry? Say I want the first 5 entries whose titles do not contain a specific string: can item_exec simply skip an entry?

gautamkrishnar commented 4 years ago

@nil0x42 thanks for the suggestion, I just released this feature. You can ignore any item by setting the post variable to null via JavaScript using the item_exec parameter.

E.g.:

item_exec: "if (post.title.indexOf('example title to ignore') > -1) post = null;"

You may need to add proper escaping for some special characters.

nil0x42 commented 4 years ago

Hi! Thank you again for the speed at which you address issues! I tried it and it doesn't work because post is const, so I get a "TypeError: Assignment to constant variable." error.

Also, as item_exec might alter the title's length, it might be useful to move the TITLE_MAX_LENGTH trimming block after item_exec, so that if item_exec changes the title's length, the trimming happens afterwards.
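To illustrate why the order matters, here is a small sketch. Only the name `TITLE_MAX_LENGTH` comes from this thread; the `truncate` and `stripPrefix` functions, the limit of 20, and the sample title are assumptions, and the action's real trimming code may differ:

```javascript
// Illustrative limit; the real value is configured by the user.
const TITLE_MAX_LENGTH = 20;

// Hypothetical trimming step: cut to `max` characters and add an ellipsis.
function truncate(title, max) {
  return title.length > max ? title.slice(0, max).trim() + '...' : title;
}

// Stands in for a user's item_exec: drop a leading "RT @user: " prefix.
const stripPrefix = (t) => t.replace(/^RT @\w+: /, '');

const raw = 'RT @someone: short text here';

// Trimming before the user code cuts text that would have fit afterwards.
const trimFirst = stripPrefix(truncate(raw, TITLE_MAX_LENGTH));
// Trimming after the user code works on the final title.
const trimLast = truncate(stripPrefix(raw), TITLE_MAX_LENGTH);

console.log(trimFirst); // short t...
console.log(trimLast);  // short text here
```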

gautamkrishnar commented 4 years ago

@nil0x42 oopsie, I will fix that by updating the code to use let instead. 👍 Happy to help.
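The difference between `const` and `let` here can be sketched as follows. This is not the action's source: `applyItemExec` and the sample posts are hypothetical, and `eval` merely stands in for however the action runs the snippet.

```javascript
// Illustrative only: a made-up runner for item_exec-style snippets.
function applyItemExec(posts, snippet) {
  const kept = [];
  for (const original of posts) {
    // Declared with `let` so the snippet may reassign it (e.g. `post = null`).
    // With `const`, that assignment throws "Assignment to constant variable".
    let post = { ...original };
    eval(snippet);
    if (post !== null) kept.push(post); // null means "skip this entry"
  }
  return kept;
}

const posts = [
  { title: 'Keep me' },
  { title: 'example title to ignore' },
  { title: 'Keep me too' },
];
const snippet = "if (post.title.indexOf('example title to ignore') > -1) post = null;";

const result = applyItemExec(posts, snippet);
console.log(result.map((p) => p.title)); // [ 'Keep me', 'Keep me too' ]
```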

gautamkrishnar commented 4 years ago

@nil0x42 thanks for your suggestions, it's now released: https://github.com/gautamkrishnar/blog-post-workflow/releases/tag/1.3.4