Open Dzhuks opened 3 months ago
interesting! thanks for digging into whether it was a language or site issue. it is strange that VK produces that kind of garbled text, since that usually indicates a mismatched or unsupported encoding. it's unlikely they would be using an older standard, like ISO-8859-5 or Windows-1251.
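For illustration, a minimal Python sketch of how a charset mismatch produces this kind of garbling. It assumes, purely hypothetically, that one side works in Windows-1251 and the other in UTF-8; nothing in this thread confirms that is what VK or Slurp actually does, and the sample string is made up.

```python
# Hypothetical sample text; not taken from the VK article.
original = "Привет, мир"

# Case 1 (assumption): the bytes are Windows-1251 but get decoded as UTF-8.
# Invalid UTF-8 sequences are replaced with U+FFFD.
cp1251_bytes = original.encode("cp1251")
wrong_as_utf8 = cp1251_bytes.decode("utf-8", errors="replace")

# Case 2 (assumption): the bytes are UTF-8 but get decoded as Windows-1251.
# Each Cyrillic letter turns into two unrelated characters.
utf8_bytes = original.encode("utf-8")
wrong_as_cp1251 = utf8_bytes.decode("cp1251", errors="replace")

# Decoding with the matching charset restores the text in both cases.
right = cp1251_bytes.decode("cp1251")

print(wrong_as_utf8)
print(wrong_as_cp1251)
print(right)
```

If the garbled output in the note looks like either of the first two printed lines, that would hint at which direction the mismatch goes.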
I've got a few bugs like this queued up and will hopefully be writing fixes for them in the next week or two. I'll look for a cause this morning tho and will comment if I find it.
thanks for the report!
no obvious cause but Firefox's reader view displays the page correctly, so it is likely something to do with Slurp or Obsidian
interestingly, it slurped fine on my android device.
can you provide some detail on your setup? OS version and Obsidian version especially.
OS version: Windows 11
Obsidian: 1.67
Slurp: 0.1.12
I encountered an issue while trying to extract text from articles on VK, a Russian social media site. The articles were not processed correctly: the Russian text came out garbled and unreadable. You can see an example of this issue in the article from this URL: Escape from Google Translate.
Initially, I suspected that the problem was caused by the Russian language itself. However, I tested the extraction on an article from a Russian news site, and it worked perfectly. Here's an example article that was processed correctly: How to Transfer Money to Kazakhstan from Russia in 2023-2024.
This indicates that the issue is specific to the VK platform rather than the Russian language as a whole.
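One way to narrow down a site-specific problem like this is to check which charset the page declares and decode with that instead of assuming UTF-8. The sketch below is a rough illustration only: `pick_charset` and `decode_page` are hypothetical helper names, not part of Slurp or Obsidian, and the sample page bytes are invented rather than fetched from VK.

```python
import re


def pick_charset(content_type, head, default="utf-8"):
    """Return the declared charset, preferring the HTTP header over <meta>."""
    if content_type:
        m = re.search(r"charset=\"?([\w-]+)", content_type, re.IGNORECASE)
        if m:
            return m.group(1).lower()
    # Fall back to a <meta ... charset=...> declaration near the top of the page.
    m = re.search(rb"<meta[^>]+charset=[\"']?([\w-]+)", head, re.IGNORECASE)
    if m:
        return m.group(1).decode("ascii").lower()
    return default


def decode_page(raw, content_type=None):
    """Decode raw page bytes using the declared charset, or UTF-8 if unknown."""
    charset = pick_charset(content_type, raw[:1024])
    try:
        return raw.decode(charset, errors="replace")
    except LookupError:
        # Python does not know this charset name; fall back to UTF-8.
        return raw.decode("utf-8", errors="replace")


# Invented example page, encoded as Windows-1251 for the sake of the demo.
page = '<meta charset="windows-1251"><p>Привет</p>'.encode("cp1251")
print(pick_charset(None, page))
print(decode_page(page))
```

Comparing the declared charset of the VK article with that of the news article that slurped correctly would show whether the two sites differ in this respect.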