Closed scriptzteam closed 6 years ago
There are unfortunately a couple issues with the site that'll make it hard to get right:
Names are mangled in the `dht_results` table in API responses, e.g. `[tvN] 삼시세끼 바다목장편.E05.170901.720p-NEXT.mp4` (id=106189), and the RSS feed packs all of an item's metadata into a single `<description>` block:
<description>
<![CDATA[
<b>ID:</b> 106760<br /><b>MAGNET:</b> magnet:?xt=urn:btih:7eb8b1cc32c2078afb04e3c6aa842f9a7288afdb&dn=%D0%9A%D0%BE%D0%BC%D0%BF%D0%BE%D0%BD%D0%B5%D0%BD%D1%82%D1%8B_%D0%B8_%D1%82%D0%B5%D1%85%D0%BD%D0%BE%D0%BB%D0%BE%D0%B3%D0%B8%D0%B8-2014<br /><b>NAME:</b> Компоненты_и_технологии-2014<br /><b>SIZE:</b> 542.95MB<br /><b>DISCOVERED:</b> 2017-09-03 18:30:38
]]>
</description>
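Since the feed crams every field into one HTML blob inside the CDATA, a crawler has to pull them back out itself. A minimal sketch of that parsing — the regex is inferred from the sample above, not from any documented format:

```python
import re

# <description> payload from the feed item above (magnet link shortened).
desc = (
    "<b>ID:</b> 106760<br /><b>MAGNET:</b> magnet:?xt=urn:btih:7eb8b1cc"
    "<br /><b>NAME:</b> Компоненты_и_технологии-2014"
    "<br /><b>SIZE:</b> 542.95MB"
    "<br /><b>DISCOVERED:</b> 2017-09-03 18:30:38"
)

def parse_description(html: str) -> dict:
    """Split the '<b>KEY:</b> value<br />' pairs into a dict."""
    return dict(re.findall(r"<b>(\w+):</b>\s*(.*?)\s*(?=<br />|$)", html))

fields = parse_description(desc)
```

After this, `fields["SIZE"]` is `"542.95MB"` and `fields["MAGNET"]` holds the magnet URI; the lazy match plus the `<br />`/end-of-string lookahead keeps values containing spaces (like `DISCOVERED`) intact.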
That said, I'm gonna try and get this site implemented.
I don't see the characters like you see them :)
Check https://xbit.pw/?id=106196
Output name is [tvN] 삼시세끼 바다목장편.E05.170901.720p-NEXT.mp4
All you need to do is URLDECODE the name part :)
%5BtvN%5D+%EC%82%BC%EC%8B%9C%EC%84%B8%EB%81%BC+%EB%B0%94%EB%8B%A4%EB%AA%A9%EC%9E%A5%ED%8E%B8
-->
https://urldecode.org/?text=%255BtvN%255D%2B%25EC%2582%25BC%25EC%258B%259C%25EC%2584%25B8%25EB%2581%25BC%2B%25EB%25B0%2594%25EB%258B%25A4%25EB%25AA%25A9%25EC%259E%25A5%25ED%258E%25B8&mode=decode
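The same decode takes a couple of lines of stdlib Python (the urldecode.org link percent-encodes the string a second time so it survives as a query parameter, hence the `%25` sequences in the URL):

```python
from urllib.parse import unquote_plus

encoded = (
    "%5BtvN%5D+%EC%82%BC%EC%8B%9C%EC%84%B8%EB%81%BC"
    "+%EB%B0%94%EB%8B%A4%EB%AA%A9%EC%9E%A5%ED%8E%B8"
)
# '+' becomes a space; %XX byte sequences are decoded as UTF-8.
name = unquote_plus(encoded)
print(name)  # [tvN] 삼시세끼 바다목장편
```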
And also for the ID you were talking about:
https://xbit.pw/?id=106189
http://urldecode.org/?text=%25D0%259F%25D0%25BE%25D0%25BF%25D1%2581%25D0%25BE%25D0%25B2%25D1%258B%25D0%25B9%2B%25D1%2580%25D0%25B0%25D0%25B9.%2B%25D0%25A1%25D1%2583%25D0%25BF%25D0%25B5%25D1%2580%25D1%2581%25D0%25B1%25D0%25BE%25D1%2580%25D0%25BD%25D0%25B8%25D0%25BA%2B%25D0%25BE%25D1%2582%2B%25D0%25A0%25D1%2583%25D1%2581%25D1%2581%25D0%25BA%25D0%25BE%25D0%25B3%25D0%25BE%2B%25D1%2580%25D0%25B0%25D0%25B4%25D0%25B8%25D0%25BE%2B%25282016%2529&mode=decode
-->
Попсовый рай. Суперсборник от Русского радио (2016)
:)
I see the issue with the names appears in the JSON API too, so maybe the bug is in how that response is generated.
The ID was a copy-paste error, my bad.
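For what it's worth, the garbling seen on the site is consistent with classic mojibake: UTF-8 bytes being re-decoded as Windows-1252 somewhere in the page/JSON generation. A round-trip sketch — the cp1252 guess is mine, not confirmed by the site:

```python
# First syllables of the title above; 끼 is left out because one of its
# UTF-8 bytes (0x81) has no Windows-1252 mapping and would fail to decode.
name = "삼시세"
mojibake = name.encode("utf-8").decode("cp1252")
print(mojibake)  # ì‚¼ì‹œì„¸ — matches the garbling seen on the site

# The damage is reversible as long as no bytes were dropped along the way.
fixed = mojibake.encode("cp1252").decode("utf-8")
assert fixed == name
```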
@skwerlman Can we consider starting work on the xBit crawler soon? Or is it bound to be a massive headache? (Which doesn't bother me; it would just lower its priority ;)
It doesn't look too hard, although the RSS feed is malformed, and neither the feed nor the JSON API contains seeder/leecher counts.
I can either set those to 0, or give up the RSS/API speed improvement and scrape each torrent's detail page. Which is preferable?
Ideally we should email them and kindly ask them to add the relevant details to their API… I don't love it, but let's not make waves yet and just use the API, flaws and all :/
Investigating further, it doesn't look like the detail pages have seeder/leecher counts either, so I guess we'll have to default to 0 either way.
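A crawler entry would then look something like this — a sketch only, with the field names assumed from the `<description>` sample above and the counts hard-coded to 0, since no part of the site exposes them:

```python
from urllib.parse import unquote_plus

def normalize(record: dict) -> dict:
    """Turn one xBit record (keys assumed, not documented) into a crawler
    entry. Names arrive percent-encoded, so they are URL-decoded here."""
    return {
        "name": unquote_plus(record["NAME"]),
        "magnet": record["MAGNET"],
        "size": record["SIZE"],
        "discovered": record["DISCOVERED"],
        "seeders": 0,   # not available via RSS, API, or detail pages
        "leechers": 0,  # same — defaulted until the site exposes them
    }
```

For example, `normalize({"NAME": "%5BtvN%5D+test", ...})` yields an entry whose `name` is `[tvN] test` and whose counts are both 0.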
@skwerlman yeah and I don't feel like crawling the DHT for the infohashes :p Do you think you can spend some time doing this crawler? You can base its architecture on the EZTV crawler :)
Haha, that's exactly what I'm doing right now.
Fantastique :)
https://xbit.pw/readme