giosh94mhz / GeonamesBundle

Symfony2 bundle to import and use Geonames toponyms
MIT License

LogicException on geonames:import #3

Open mekras opened 10 years ago

mekras commented 10 years ago

Got error " [LogicException] You can't regress the progress bar" when running geonames:import.

#0 vendor/giosh94mhz/geonames-bundle/Giosh94mhz/GeonamesBundle/Event/ImportOutputSubscriber.php(158): Symfony\Component\Console\Helper\ProgressHelper->setCurrent(-4)
#1 vendor/giosh94mhz/geonames-bundle/Giosh94mhz/GeonamesBundle/Event/ImportOutputSubscriber.php(94): Giosh94mhz\GeonamesBundle\Event\ImportOutputSubscriber->progress(-4, -4)
#2 [internal function]: Giosh94mhz\GeonamesBundle\Event\ImportOutputSubscriber->onDownloadProgress(Object(Giosh94mhz\GeonamesBundle\Event\OnProgressEvent), 'geonames.import...', Object(Symfony\Component\HttpKernel\Debug\TraceableEventDispatcher))
#3 vendor/symfony/symfony/src/Symfony/Component/HttpKernel/Debug/TraceableEventDispatcher.php(392): call_user_func(Array, Object(Giosh94mhz\GeonamesBundle\Event\OnProgressEvent), 'geonames.import...', Object(Symfony\Component\HttpKernel\Debug\TraceableEventDispatcher))
#4 [internal function]: Symfony\Component\HttpKernel\Debug\TraceableEventDispatcher->Symfony\Component\HttpKernel\Debug\{closure}(Object(Giosh94mhz\GeonamesBundle\Event\OnProgressEvent), 'geonames.import...', Object(Symfony\Component\EventDispatcher\ContainerAwareEventDispatcher))
#5 vendor/symfony/symfony/src/Symfony/Component/EventDispatcher/EventDispatcher.php(164): call_user_func(Object(Closure), Object(Giosh94mhz\GeonamesBundle\Event\OnProgressEvent), 'geonames.import...', Object(Symfony\Component\EventDispatcher\ContainerAwareEventDispatcher))
#6 vendor/symfony/symfony/src/Symfony/Component/EventDispatcher/EventDispatcher.php(53): Symfony\Component\EventDispatcher\EventDispatcher->doDispatch(Array, 'geonames.import...', Object(Giosh94mhz\GeonamesBundle\Event\OnProgressEvent))
#7 vendor/symfony/symfony/src/Symfony/Component/EventDispatcher/ContainerAwareEventDispatcher.php(167): Symfony\Component\EventDispatcher\EventDispatcher->dispatch('geonames.import...', Object(Giosh94mhz\GeonamesBundle\Event\OnProgressEvent))
#8 vendor/symfony/symfony/src/Symfony/Component/HttpKernel/Debug/TraceableEventDispatcher.php(139): Symfony\Component\EventDispatcher\ContainerAwareEventDispatcher->dispatch('geonames.import...', Object(Giosh94mhz\GeonamesBundle\Event\OnProgressEvent))
#9 vendor/giosh94mhz/geonames-bundle/Giosh94mhz/GeonamesBundle/Import/ImportDirector.php(144): Symfony\Component\HttpKernel\Debug\TraceableEventDispatcher->dispatch('geonames.import...', Object(Giosh94mhz\GeonamesBundle\Event\OnProgressEvent))
#10 [internal function]: Giosh94mhz\GeonamesBundle\Import\ImportDirector->Giosh94mhz\GeonamesBundle\Import\{closure}(-4, -4)
#11 vendor/giosh94mhz/geonames-bundle/Giosh94mhz/GeonamesBundle/Utils/CurlDownload.php(132): call_user_func(Object(Closure), -4, -4)
#12 [internal function]: Giosh94mhz\GeonamesBundle\Utils\CurlDownload->Giosh94mhz\GeonamesBundle\Utils\{closure}(0, 0)
#13 vendor/giosh94mhz/geonames-bundle/Giosh94mhz/GeonamesBundle/Utils/CurlDownload.php(146): call_user_func(Object(Closure), 0, 0)
#14 [internal function]: Giosh94mhz\GeonamesBundle\Utils\CurlDownload->Giosh94mhz\GeonamesBundle\Utils\{closure}(0, 0, 0, 0)
#15 vendor/giosh94mhz/geonames-bundle/Giosh94mhz/GeonamesBundle/Utils/CurlDownload.php(242): curl_multi_exec(Resource id #426, 4)
#16 vendor/giosh94mhz/geonames-bundle/Giosh94mhz/GeonamesBundle/Utils/CurlDownload.php(153): Giosh94mhz\GeonamesBundle\Utils\CurlDownload->curlMultiDownload(Array)
#17 vendor/giosh94mhz/geonames-bundle/Giosh94mhz/GeonamesBundle/Import/ImportDirector.php(148): Giosh94mhz\GeonamesBundle\Utils\CurlDownload->download()
#18 vendor/giosh94mhz/geonames-bundle/Giosh94mhz/GeonamesBundle/Import/ImportDirector.php(59): Giosh94mhz\GeonamesBundle\Import\ImportDirector->download()
#19 vendor/giosh94mhz/geonames-bundle/Giosh94mhz/GeonamesBundle/Command/AbstractImportCommand.php(50): Giosh94mhz\GeonamesBundle\Import\ImportDirector->import()
#20 vendor/symfony/symfony/src/Symfony/Component/Console/Command/Command.php(241): Giosh94mhz\GeonamesBundle\Command\AbstractImportCommand->execute(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#21 vendor/symfony/symfony/src/Symfony/Component/Console/Application.php(888): Symfony\Component\Console\Command\Command->run(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#22 vendor/symfony/symfony/src/Symfony/Component/Console/Application.php(191): Symfony\Component\Console\Application->doRunCommand(Object(Giosh94mhz\GeonamesBundle\Command\ImportCommand), Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#23 vendor/symfony/symfony/src/Symfony/Bundle/FrameworkBundle/Console/Application.php(96): Symfony\Component\Console\Application->doRun(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#24 vendor/symfony/symfony/src/Symfony/Component/Console/Application.php(121): Symfony\Bundle\FrameworkBundle\Console\Application->doRun(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#25 app/console(27): Symfony\Component\Console\Application->run(Object(Symfony\Component\Console\Input\ArgvInput))
#26 {main}

./console geonames:import

The cause is that curl_getinfo($ch, CURLINFO_CONTENT_LENGTH_DOWNLOAD), called at line 105 of CurlDownload, returns -1 when the download size is not known.
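
A minimal sketch of how the unknown-size case could be guarded, assuming the reported lengths are summed before being passed to the progress bar (the -4 in the trace would be consistent with four lengths of -1 being added up, but that is a guess). The function names are hypothetical and are not the bundle's actual code; clamping affects only the progress display, not the download itself.

```php
<?php
// Hypothetical helper, not part of GeonamesBundle: cURL reports -1 when the
// server does not announce a Content-Length, so such values count as 0.
function knownLength($reported)
{
    return $reported > 0 ? (int) $reported : 0;
}

// Hypothetical helper: total expected bytes across several transfers,
// ignoring the ones whose size is unknown.
function totalKnownLength(array $reportedLengths)
{
    return array_sum(array_map('knownLength', $reportedLengths));
}

// Hypothetical helper: a progress value that never drops below the one
// already shown, so a progress bar is never asked to regress.
function monotonicProgress($previous, $current)
{
    return max($previous, $current, 0);
}
```

With these helpers, four unknown lengths add up to 0 instead of -4, and a negative update leaves the displayed progress where it was.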

giosh94mhz commented 10 years ago

Thanks for the report. I've fixed it. Please note that I'm still in the early stages of this bundle and many things will change. I think that in the near future I will migrate to the Guzzle library for downloading.

mekras commented 10 years ago

It seems the problem is still not fixed. Now downloadsSize is always zero if curl_getinfo returns -1, so nothing is downloaded at all.

giosh94mhz commented 10 years ago

@mekras I'm not able to reproduce the issue, both with my unit testing and with "dump files" download.

The call to CurlDownload::requestContentLength is just for displaying progress, so it should not prevent the download. The curl_getinfo call which triggers the error is bound to a HEAD request to the Geonames dump files folder. Just to exclude a connection or configuration error, try this in a shell on your side:

curl -I http://download.geonames.org/export/dump/allCountries.zip

mekras commented 10 years ago

mekras@mekras: curl -I http://download.geonames.org/export/dump/allCountries.zip
HTTP/1.1 200 OK
Date: Fri, 07 Feb 2014 11:57:13 GMT
Server: Apache/2.2.17 (Linux/SUSE)
Last-Modified: Fri, 07 Feb 2014 05:12:18 GMT
ETag: "33c18b2-ebb7b82-4f1ca08abb080"
Accept-Ranges: bytes
Content-Length: 247167874
Content-Type: application/zip
X-Pad: avoid browser bug

But if I insert the code

print_r(curl_getinfo($ch)); die;

into the requestContentLength method, I get:

Array
(
    [url] => http://download.geonames.org/export/dump/featureCodes_en.txt
    [content_type] => 
    [http_code] => 0
    [header_size] => 0
    [request_size] => 0
    [filetime] => -1
    [ssl_verify_result] => 0
    [redirect_count] => 0
    [total_time] => 0.000123
    [namelookup_time] => 0
    [connect_time] => 0
    [pretransfer_time] => 0
    [size_upload] => 0
    [size_download] => 0
    [speed_download] => 0
    [speed_upload] => 0
    [download_content_length] => -1
    [upload_content_length] => -1
    [starttransfer_time] => 0
    [redirect_time] => 0
    [certinfo] => Array
        (
        )

    [primary_ip] => 
    [primary_port] => 0
    [local_ip] => 
    [local_port] => 0
    [redirect_url] => 
)

Backtrace:

#0 vendor/giosh94mhz/geonames-bundle/Giosh94mhz/GeonamesBundle/Utils/CurlDownload.php(132): Giosh94mhz\GeonamesBundle\Utils\CurlDownload->requestContentLength()
#1 vendor/giosh94mhz/geonames-bundle/Giosh94mhz/GeonamesBundle/Import/ImportDirector.php(148): Giosh94mhz\GeonamesBundle\Utils\CurlDownload->download()
#2 vendor/giosh94mhz/geonames-bundle/Giosh94mhz/GeonamesBundle/Import/ImportDirector.php(59): Giosh94mhz\GeonamesBundle\Import\ImportDirector->download()
#3 vendor/giosh94mhz/geonames-bundle/Giosh94mhz/GeonamesBundle/Command/AbstractImportCommand.php(50): Giosh94mhz\GeonamesBundle\Import\ImportDirector->import()
#4 vendor/symfony/symfony/src/Symfony/Component/Console/Command/Command.php(241): Giosh94mhz\GeonamesBundle\Command\AbstractImportCommand->execute(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#5 vendor/symfony/symfony/src/Symfony/Component/Console/Application.php(888): Symfony\Component\Console\Command\Command->run(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#6 vendor/symfony/symfony/src/Symfony/Component/Console/Application.php(191): Symfony\Component\Console\Application->doRunCommand(Object(Giosh94mhz\GeonamesBundle\Command\ImportCommand), Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#7 vendor/symfony/symfony/src/Symfony/Bundle/FrameworkBundle/Console/Application.php(96): Symfony\Component\Console\Application->doRun(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#8 vendor/symfony/symfony/src/Symfony/Component/Console/Application.php(121): Symfony\Bundle\FrameworkBundle\Console\Application->doRun(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#9 app/console(27): Symfony\Component\Console\Application->run(Object(Symfony\Component\Console\Input\ArgvInput))

From console:

mekras@mekras: curl -I http://download.geonames.org/export/dump/featureCodes_en.txt
HTTP/1.1 200 OK
Date: Fri, 07 Feb 2014 12:08:37 GMT
Server: Apache/2.2.17 (Linux/SUSE)
Last-Modified: Fri, 07 Feb 2014 04:54:29 GMT
ETag: "33c18ad-deba-4f1c9c8f40b40"
Accept-Ranges: bytes
Content-Length: 57018
Content-Type: text/plain; charset=utf-8

Some more info:

giosh94mhz commented 10 years ago

I have a less recent version of both PHP and cURL, so I don't think (hope :) it is a regression.

There are two weird things: 1) namelookup_time and connect_time are 0, while in my tests they have meaningful values; 2) the command line works perfectly.

So maybe it's just a PHP module problem. Just to be sure, check that the configuration contains something like this (reduced):

curl

cURL support => enabled
Largefile => Yes
libz => Yes
Protocols => dict, file, ftp, ftps, gopher, http, https, imap, imaps, ldap, pop3, pop3s, rtmp, rtsp, scp, sftp, smtp, smtps, telnet, tftp
ZLib Version => 1.2.7

Also, you can try replacing the body of requestContentLength with a simple "return 1". You'll get a weird progress bar, but the download should happen anyway. If that's the case, maybe there is some problem with curl_copy_handle.
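
One way to tell the two cases discussed above apart (a handle that never received an HTTP response versus a server that answered without reporting a length) is to inspect the array returned by curl_getinfo. The sketch below is only a diagnostic aid under that assumption; describeLengthProbe is a hypothetical name, and the array keys are the ones visible in the dump above.

```php
<?php
// Hypothetical diagnostic helper, not part of GeonamesBundle.
// $info is expected to be the array returned by curl_getinfo($ch).
function describeLengthProbe(array $info)
{
    // http_code stays 0 when no HTTP response was received at all.
    if ((int) $info['http_code'] === 0) {
        return 'no-response';
    }
    // A response arrived, but the server did not announce a size.
    if ($info['download_content_length'] < 0) {
        return 'length-unknown';
    }
    return 'length-known';
}
```

Applied to the dump above (http_code 0, download_content_length -1), this returns 'no-response', which would point at the request itself rather than at a missing Content-Length header.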

mekras commented 10 years ago

The same problem occurs on FreeBSD with PHP 5.3.28 and cURL 7.35.0. Configuration is OK.

giosh94mhz commented 10 years ago

I'm reopening the issue since it is not fixed.

Still, I'm clueless about where to start. 1) Have you tried the return 1 trick from the function requestContentLength? 2) Are the unit tests of CurlDownload affected by the issue?

In the meantime, I'm working on some code changes to use Guzzle as a download adapter. CurlDownload will still be there anyway, so I hope we get this fixed.

mekras commented 10 years ago

1) Have you tried the return 1 trick from the function requestContentLength?

Not yet. I'm trying to determine the cause of the error.

2) Are the unit tests of CurlDownload affected by the issue?

I'll check it next week.

mekras commented 10 years ago

Download tests failed:

There were 2 failures:

1) Giosh94mhz\GeonamesBundle\Tests\Utils\CurlDownloadTest::testDownload
Failed asserting that 0 is greater than 0.

GeonamesBundle/Tests/Utils/CurlDownloadTest.php:77

2) Giosh94mhz\GeonamesBundle\Tests\Utils\CurlDownloadTest::testMultiDownload
Failed asserting that 0 is greater than 0.

GeonamesBundle/Tests/Utils/CurlDownloadTest.php:145

giosh94mhz commented 10 years ago

I've just pushed some updates and refactoring. Now you can switch to Guzzle as the download provider by setting:

giosh94mhz_geonames:
    download:
        adapter:    guzzle

For now the instance of Guzzle\Http\Client is not injected, but it defaults to cURL. So we can find out whether it is a cURL problem or actually a bug in CurlDownloaderAdapter (name refactored :)

mekras commented 10 years ago

Using Guzzle solves the problem. But it requires setting memory_limit to at least 1G.

giosh94mhz commented 10 years ago

Mmm... if Guzzle solves it, then there is definitely something wrong in CurlDownloadAdapter. Still, I cannot reproduce it, so... are you able to track down the issue by comparing the Guzzle cURL classes to mine? From a quick check I see that it doesn't use the CURLOPT_FILE option to set the destination path, but uses php://temp...

As for memory_limit, the problem is the cache, which calls response->getBody on the dump. I think this will need a patch to Guzzle (when using the save_to option together with caching), so I'll check it out in the next few days.
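
To illustrate why reading a whole dump into a string is costly while copying it between streams is not, here is a minimal sketch using plain PHP streams. It does not use Guzzle's API; saveStreamTo is a hypothetical helper, and the php://temp stream only stands in for a downloaded body (by default php://temp keeps up to 2 MB in memory and then spills to a temporary file).

```php
<?php
// Hypothetical helper, not Guzzle or GeonamesBundle code: copy an already
// open readable stream to a file without building the body as one string.
function saveStreamTo($source, $path)
{
    $target = fopen($path, 'wb');
    if ($target === false) {
        throw new RuntimeException("Cannot open $path for writing");
    }
    // stream_copy_to_stream moves the data in chunks between the two streams.
    $copied = stream_copy_to_stream($source, $target);
    fclose($target);
    if ($copied === false) {
        throw new RuntimeException("Copy to $path failed");
    }
    return $copied;
}

// Usage: a php://temp stream stands in for the response body.
$body = fopen('php://temp', 'w+b');
fwrite($body, str_repeat('x', 1024));
rewind($body);

$path = tempnam(sys_get_temp_dir(), 'dump');
$bytes = saveStreamTo($body, $path);
fclose($body);
```

Here $bytes holds the number of bytes written to $path; peak memory stays close to the chunk size rather than the size of the dump.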