reneeb / cpan2ebook

Wrong Book for Module Name #12

Closed borisdaeppen closed 12 years ago

borisdaeppen commented 12 years ago

If I enter

Log::Log4perl

and press EPUB, I get

Tie-Log4perl-0.1.epub

borisdaeppen commented 12 years ago

The problem occurs because autocomplete from metacpan-api is used to match user-input.

Autocomplete has strange behavior here:

http://api.metacpan.org//v0/search/autocomplete?&q=Log++Log4perl

     {
        "_score" : 1.169038,
        "fields" : {
           "documentation" : "Tie::Log4perl",
           "release" : "Tie-Log4perl-0.1",
           "author" : "FRODWITH",
           "distribution" : "Tie-Log4perl"
        },
        [...]
     },
     {
        "_score" : 1.169038,
        "fields" : {
           "documentation" : "Log::Log4perl",
           "release" : "Log-Log4perl-1.36",
           "author" : "MSCHILLI",
           "distribution" : "Log-Log4perl"
        },
        [...]
     },
     {
        "_score" : 1.1590381,
        "fields" : {
           "documentation" : "Test::Log4perl",
           "release" : "Test-Log4perl-0.1001",
           "author" : "FOTANGO",
           "distribution" : "Test-Log4perl"
        },
        [...]
     },

Both Tie::Log4perl and Log::Log4perl have the same score, 1.169038. Even worse, Tie::Log4perl is placed on top. If I add the parameter &size=1 to the request URL, I get only Tie::Log4perl, although I was asking for a match with Log::Log4perl.

This looks like a bug to me...
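
To make the tie concrete, here is a minimal Python sketch. The thread itself contains no code, so the language and the helper name `top_scoring` are assumptions for illustration. It uses only the scores and names from the response quoted above and shows that taking the first hit, which is what `&size=1` amounts to, picks one of two equally ranked entries.

```python
# Hits copied from the autocomplete response quoted above (fields trimmed).
hits = [
    {"_score": 1.169038, "documentation": "Tie::Log4perl"},
    {"_score": 1.169038, "documentation": "Log::Log4perl"},
    {"_score": 1.1590381, "documentation": "Test::Log4perl"},
]


def top_scoring(hits):
    """Return the names of all hits that share the highest score."""
    best = max(h["_score"] for h in hits)
    return [h["documentation"] for h in hits if h["_score"] == best]


first_only = hits[0]["documentation"]  # what a size=1 request effectively yields
tied = top_scoring(hits)               # more than one entry means the order is arbitrary
```

With this data `tied` holds two names, so the score alone cannot tell which of them the user meant.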

borisdaeppen commented 12 years ago

I opened an issue here: https://github.com/CPAN-API/cpan-api/issues/203

No idea if this is going to be fixed soon... or at all.

dvergin commented 12 years ago

Similarly a request for "Moo" returns the docs for "ppt".

I added a comment at https://github.com/CPAN-API/cpan-api/issues/203 describing this somewhat different example (i.e. in this case the lookup terms are entirely unrelated and there is some confusion about the "documentation" field shown for ppt at http://api.metacpan.org//v0/search/autocomplete?&q=Moo)

Question: Is there a work-around to "trick" perlybook into returning the desired item?

[Edit] Curiouser and curiouser: If I uncheck "fetch complete release", I get a file called Module_Moo.mobi which contains the docs for Moo but which shows "Perl Module Documentation ppt-0.14" on its title page. The "Table of Contents" page and the actual contents are all of "Moo". Only the title page is mis-labeled. And it appears in my mobi library as "ppt-0.14".

borisdaeppen commented 12 years ago

It's quite a pain to work with CPAN, because anything can happen there :-)

As a first step I'll try to replace the autocomplete on the server side. It looked handy when I implemented it, but now it's the cause of a lot of errors.

If anybody can help: I need a function (maybe from MetaCPAN::API) that lets me check whether a given module name is valid, meaning that a lookup for it will return a result.

reneeb commented 12 years ago

For stuff like the perl_mlb issue (#14) we should maintain a "blacklist".

Part of the solution could be to grab two items from the result list and, when a distribution in that list matches the search term exactly, provide that distribution's documentation.
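
A rough sketch of that idea, in Python for illustration only (it is not the project's code). The function name `pick_hit`, the `limit` parameter, the `::` to `-` conversion used to compare against the distribution name, and the fallback to the first hit are all assumptions. The hit layout follows the autocomplete response quoted earlier in this thread.

```python
def pick_hit(query, hits, limit=2):
    """Prefer an exact match among the first `limit` hits.

    Assumption: a distribution is usually named like its main module
    with '::' replaced by '-', which is a common CPAN convention but
    not a guarantee.
    """
    wanted_dist = query.replace("::", "-")
    for hit in hits[:limit]:
        fields = hit["fields"]
        if fields.get("documentation") == query:
            return fields
        if fields.get("distribution") == wanted_dist:
            return fields
    # Assumed fallback: keep the old behavior and take the top hit.
    return hits[0]["fields"] if hits else None


hits = [
    {"_score": 1.169038, "fields": {"documentation": "Tie::Log4perl",
                                    "release": "Tie-Log4perl-0.1",
                                    "distribution": "Tie-Log4perl"}},
    {"_score": 1.169038, "fields": {"documentation": "Log::Log4perl",
                                    "release": "Log-Log4perl-1.36",
                                    "distribution": "Log-Log4perl"}},
]
chosen = pick_hit("Log::Log4perl", hits)
```

Looking at only two items covers the tie reported for Log::Log4perl. A case like "Moo" returning "ppt" may need a larger `limit` or a direct lookup instead.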

Can we start a FAQ in the wiki please? In that FAQ we can document such problems.

borisdaeppen commented 12 years ago

The issue with the wrong matching can be solved by using requests like

http://api.metacpan.org/v0/module/EBook::MOBI

for modules and

http://api.metacpan.org/v0/release/EBook-MOBI

for releases. By putting this into an eval{} we should be able to tell if the request will be a valid one... I'll implement that in the next few weeks or days.
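
As an illustration of that approach, here is a minimal Python sketch in which try/except plays the role of the eval{}. It is not the code from the eventual commit. The names `lookup_url`, `http_get` and `is_valid` are hypothetical, and the base URL is kept exactly as quoted above, although that host may no longer respond today. The fetch function is injectable so the check can be exercised without network access.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "http://api.metacpan.org/v0"  # as quoted above; may be outdated


def lookup_url(kind, name):
    """Build the direct lookup URL; kind is 'module' or 'release'."""
    if kind not in ("module", "release"):
        raise ValueError(f"unknown kind: {kind!r}")
    # Keep ':' unescaped so the URL looks like the ones in the thread.
    return f"{API_BASE}/{kind}/{urllib.parse.quote(name, safe=':')}"


def http_get(url):
    """Fetch a URL and decode the JSON body (raises on HTTP errors)."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def is_valid(kind, name, fetch=http_get):
    """Return True if the direct lookup succeeds, False otherwise."""
    url = lookup_url(kind, name)
    try:
        fetch(url)
    except (OSError, ValueError):
        # URLError/HTTPError and timeouts are OSError subclasses;
        # a body that is not JSON raises a ValueError subclass.
        return False
    return True
```

A call such as `is_valid("module", "EBook::MOBI")` would then replace the autocomplete match. A plain `False` does not distinguish an unknown name from a network failure, which a real implementation would probably want to report separately.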

borisdaeppen commented 12 years ago

Closed in commit fc65f5b296f4f63d164d2de18ec66020e7811c01