DrJLR / xbmc-adult

Automatically exported from code.google.com/p/xbmc-adult

Improved aebn.net scraper #51

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
Please consider the following patch to improve the aebn.net scraper

Original issue reported on code.google.com by kmi...@gmail.com on 3 Jan 2013 at 9:59

GoogleCodeExporter commented 8 years ago
Could you elaborate on the changes?!

Original comment by ltub...@gmail.com on 4 Jan 2013 at 11:17

GoogleCodeExporter commented 8 years ago
To be honest, it's more like a complete rewrite, as the original didn't work 
for me at all (others also report breakage): 
http://forum.xbmc.org/showthread.php?tid=94942&pid=1264990#pid1264990

If accepted, someone will need to update the gay scraper as well (should be 
just a s/straight/gay/g).

Original comment by kmi...@gmail.com on 4 Jan 2013 at 5:15

GoogleCodeExporter commented 8 years ago
I tested this and it is not matching titles properly. It finds matches for 
random stuff and ends up associating the wrong movie with a file.

Also, the patch did not apply cleanly:
Hunk #1 succeeded at 1 with fuzz 1.

Maybe that is why it is working badly for me.

Original comment by mrdougqu...@gmail.com on 5 Jan 2013 at 9:43

GoogleCodeExporter commented 8 years ago
Adding quotes to the query in the search URL drastically improves the results. 
What do you think?

Original comment by mrdougqu...@gmail.com on 5 Jan 2013 at 10:11

GoogleCodeExporter commented 8 years ago
Attaching my whole file just in case.

As for the search query, it works for me most of the time, but YMMV. I found a lot 
depends on file naming: if the name is not exactly the same as in AEBN, their fuzzy 
search results are not great, and I sometimes need to refresh manually and enter a 
better starting string after looking it up on the web. With the quotes it might be 
too restrictive and return no matches at all. Neither outcome is ideal and both 
require manual intervention, so it is a 50/50 decision for me.

Original comment by kmi...@gmail.com on 5 Jan 2013 at 12:54

GoogleCodeExporter commented 8 years ago
To illustrate with an example, depending on how your file on disk was named:

lickland+par+2 returns tons of matches, the first being the desired one and all 
is good

"lickland+part+2" returns only the one desired match, again all is good

lickland+2 returns tons of bogus matches (the desired one might be somewhere in 
there)

"lickland+2" returns no matches

So I don't think we can improve the search query much, their search engine 
simply is not that great.
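
For illustration only, the quoted versus unquoted trade-off above could be sketched in Python as "try the exact phrase first, then fall back to the fuzzy query". This is not the scraper's actual code: the endpoint in SEARCH_URL, the function names, and the caller-supplied fetch callback are all assumptions made up for this sketch, and the real aebn.net search URL is not given in this thread.

```python
from urllib.parse import quote_plus

# Hypothetical endpoint: the real search URL is not stated in this thread.
SEARCH_URL = "http://search.example.invalid/search?q={query}"


def build_search_url(title, quoted=False):
    """Return a search URL for `title`, optionally as an exact phrase."""
    # Wrapping the title in double quotes asks for an exact-phrase match.
    phrase = '"{}"'.format(title) if quoted else title
    # quote_plus turns spaces into '+' and the quote character into %22.
    return SEARCH_URL.format(query=quote_plus(phrase))


def search_with_fallback(title, fetch):
    """Try the quoted query first; fall back to the unquoted one if empty.

    `fetch` is a caller-supplied function (an assumption of this sketch)
    that takes a URL and returns a list of results.
    """
    results = fetch(build_search_url(title, quoted=True))
    if results:
        return results
    return fetch(build_search_url(title, quoted=False))
```

With a file named "lickland 2", the quoted attempt would return nothing and the unquoted attempt would still produce the long, noisy result list, so a manual refresh may remain necessary in that case.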

Original comment by kmi...@gmail.com on 5 Jan 2013 at 1:06

GoogleCodeExporter commented 8 years ago

Original comment by mrdougqu...@gmail.com on 5 Jan 2013 at 7:24

GoogleCodeExporter commented 8 years ago
This issue was closed by revision r185.

Original comment by mrdougqu...@gmail.com on 6 Jan 2013 at 10:28