miniflux / v2

Minimalist and opinionated feed reader
https://miniflux.app
Apache License 2.0

Google News RSS Feed Actual Links #1854

Open piyushgarg opened 1 year ago

piyushgarg commented 1 year ago

Google News RSS feed links have the following form:

<link>
https://news.google.com/rss/articles/CBMiU2h0dHBzOi8vd3d3Lndhc2hpbmd0b25wb3N0LmNvbS93b3JsZC8yMDIzLzA0LzIzL2Jha2htdXQtZGVzdHJveWVkLWNpdHktdWtyYWluZS13YXIv0gEA?oc=5
</link>

When Miniflux's External Link is used, it opens the link above first, which then redirects to the original article on the publisher's website. The extra HTTP redirect adds a delay that hurts the user experience, and the Google link is also trackable. Instead, the actual link could be saved in the database. I would make this change myself, but I don't know the v2 source tree well enough to tell where it should go. Here is a Go program that extracts the original URL:

package main

import (
    "bytes"
    "encoding/base64"
    "fmt"
    "regexp"
)

func main() {
    // Group 2 captures the article ID. It is URL-safe base64, so both "-" and "_" may appear.
    expr := regexp.MustCompile(`(https.*google[.]com.*/)([a-zA-Z0-9_-]+)(\?.*)`)
    article := "https://news.google.com/rss/articles/CBMiU2h0dHBzOi8vd3d3Lndhc2hpbmd0b25wb3N0LmNvbS93b3JsZC8yMDIzLzA0LzIzL2Jha2htdXQtZGVzdHJveWVkLWNpdHktdWtyYWluZS13YXIv0gEA?oc=5"
    match := expr.FindStringSubmatch(article)
    if match == nil {
        fmt.Println("no article ID found")
        return
    }
    base64str := match[2]
    fmt.Printf("base64 %s\n", base64str)

    // The regex never captures "=" padding, so use the unpadded URL-safe decoder.
    data, err := base64.RawURLEncoding.DecodeString(base64str)
    if err != nil {
        fmt.Printf("decode error: %s\n", err)
        return
    }

    // The original URL starts at the first "http" in the decoded bytes.
    j := bytes.Index(data, []byte("http"))
    if j < 0 {
        fmt.Println("no URL found in decoded data")
        return
    }
    // In this sample the URL is followed by the unwanted byte 210 (0xD2).
    k := bytes.IndexByte(data[j:], 210)
    if k < 0 {
        fmt.Println("no terminating byte found after URL")
        return
    }
    fmt.Printf("final url %s\n", data[j:j+k])
}

piyushgarg commented 1 year ago

I sincerely request that this be included. Let me know which file it should go in, and I will create a pull request. @fguillot
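
For a pull request, the decoding could be wrapped in a small helper along the lines of the sketch below. The function name `originalURL` is hypothetical and not part of Miniflux. It assumes the decoded ID holds the target URL preceded by a single length byte, which is what the sample link in this issue decodes to; other Google News links may use a different layout, so the helper reports failure rather than guessing.

```go
package main

import (
	"bytes"
	"encoding/base64"
	"fmt"
	"net/url"
	"path"
	"strings"
)

// originalURL is a hypothetical helper (not a Miniflux API). It assumes the
// last path segment of a Google News RSS link is URL-safe base64 and that the
// decoded bytes contain the target URL preceded by a one-byte length, as in
// the sample link from this issue.
func originalURL(link string) (string, bool) {
	u, err := url.Parse(link)
	if err != nil || u.Host != "news.google.com" {
		return "", false
	}
	id := strings.TrimRight(path.Base(u.Path), "=")
	data, err := base64.RawURLEncoding.DecodeString(id)
	if err != nil {
		return "", false
	}
	start := bytes.Index(data, []byte("http"))
	if start < 1 {
		return "", false
	}
	end := start + int(data[start-1]) // assumed single-byte length prefix
	if end > len(data) {
		return "", false
	}
	// Reject anything that does not look like an absolute http(s) URL.
	target, err := url.Parse(string(data[start:end]))
	if err != nil || target.Host == "" || (target.Scheme != "http" && target.Scheme != "https") {
		return "", false
	}
	return target.String(), true
}

func main() {
	link := "https://news.google.com/rss/articles/CBMiU2h0dHBzOi8vd3d3Lndhc2hpbmd0b25wb3N0LmNvbS93b3JsZC8yMDIzLzA0LzIzL2Jha2htdXQtZGVzdHJveWVkLWNpdHktdWtyYWluZS13YXIv0gEA?oc=5"
	if target, ok := originalURL(link); ok {
		fmt.Println(target)
	} else {
		fmt.Println(link) // fall back to the link from the feed
	}
}
```

Returning a boolean lets the caller keep the feed's own link whenever decoding fails, so entries from other sources, or Google links in an unexpected format, are left unchanged.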

jansendotsh commented 8 months ago

For now, it's not too miserable to just click through and view things on the web, but pulling the original link into Miniflux would be a really nice addition.