WilliamDevin23 / Streamlit_StockDash

https://stockdash-live.streamlit.app

[BUG] Google News URL decode not working #1

Closed moehmeni closed 1 month ago

moehmeni commented 1 month ago

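import base64
import re


def get_link(link):
    """Try to recover the publisher URL encoded in a Google News RSS article link."""
    try:
        # Keep only the encoded article ID: drop the URL prefix and the query string.
        link_2 = link.replace("https://news.google.com/rss/articles/", "")
        end_idx = link_2.index("?")
        link_2 = link_2[:end_idx]
        # Replace characters outside the standard base64 alphabet (e.g. "-" and "_") with "A".
        link_2 = re.sub(r"[^A-Za-z0-9\+=/]", "A", link_2)
        # Pad to a multiple of 4 characters, as b64decode requires.
        if len(link_2) % 4 != 0:
            link_2 += "=" * (4 - len(link_2) % 4)
        translated = base64.b64decode(link_2)
        translated = translated.decode("iso-8859-1")
        # Take the first URL-looking substring; raises IndexError if there is none.
        url_match = re.findall(r"https*:[\-\.A-Za-z0-9/]*", translated)[0]
        return url_match
    except Exception as e:
        # On any failure, report the error and fall back to the original link.
        print(e)
        return link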

url = "https://news.google.com/rss/articles/CBMi2AFBVV95cUxQOHZlbFBOSXZDQTVDNWhibW9nMlUzaWpfbVRZaTNKMXd4VFNtQ2YxQWt2UmtDbHdia2xvbHZDMU03eXVabzFscDdMcHV4aGFnNW1zdU9zakVyaEFmMm1FVDVBRVotdktTbkJBOUFrT3dwNTY5bVNzZWRJQk1RT3l5SnBBeWdXS1laeVpwejQzN3luZjgwVjN0bFB5NkZSM2oxRXJ6Q0ItbDNMUDZJRTdEZXhjbUV1Z3NYMHdXV1hKV3N3YndWOVZjVE9uZlBGNkk0SS1mbTZ3b0Q?oc=5"
result = get_link(url)
print("Result:", result)
list index out of range

WilliamDevin23 commented 1 month ago

Hi! This seems to be an issue on Google's side: the string in the URL can't be decoded into the article link using base64. I've subscribed to this thread in case there are updates on the problem: https://support.google.com/news/publisher-center/thread/286847120/issue-with-final-url-retrieval-from-google-news-rss-feed?hl=en Thanks for the reminder 😄
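
To see what the article ID actually contains, it can help to decode it as URL-safe base64 (keeping `-` and `_` rather than replacing them) and only then look for a URL. The following is a minimal sketch, not a fix. It assumes, as the snippet in the report does, that a decodable link embeds a plain `http(s)://` URL in its payload. The helper names `decode_article_id`, `find_embedded_url`, and `resolve_or_fallback` are hypothetical and not part of this repository.

```python
import base64
import re
from typing import Optional
from urllib.parse import urlparse

# Assumption: an embedded URL is a run of printable ASCII starting with http:// or https://.
URL_PATTERN = re.compile(rb"https?://[\x21-\x7e]+")


def decode_article_id(link: str) -> bytes:
    """Return the raw bytes encoded in the last path segment of the link."""
    article_id = urlparse(link).path.rsplit("/", 1)[-1]
    # The ID uses the URL-safe alphabet ("-" and "_") with padding stripped,
    # so restore the "=" padding before decoding.
    padded = article_id + "=" * (-len(article_id) % 4)
    return base64.urlsafe_b64decode(padded)


def find_embedded_url(payload: bytes) -> Optional[str]:
    """Return the first URL-looking substring of the payload, or None."""
    match = URL_PATTERN.search(payload)
    return match.group().decode("ascii") if match else None


def resolve_or_fallback(link: str) -> str:
    """Return the embedded URL if one is found, otherwise the original link."""
    try:
        url = find_embedded_url(decode_article_id(link))
    except ValueError:  # binascii.Error (malformed base64) is a ValueError subclass
        url = None
    return url if url is not None else link
```

Calling `decode_article_id` on the link from the report gives a payload that begins with a few header bytes followed by the text `AU_yqLP`, not `http`. That is consistent with the `list index out of range` error above. This sketch does not attempt to turn that inner string into the final article URL.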