Ardesco / driver-binary-downloader-maven-plugin

A Maven plugin that will download the WebDriver stand alone server executables for use in your mavenised Selenium project.
Apache License 2.0

Unexpected delay when downloading/extracting drivers for all platforms #22

Closed. cowwoc closed this issue 9 years ago

cowwoc commented 9 years ago

Version 1.0.4

Given the configuration:

<plugin>
    <groupId>com.lazerycode.selenium</groupId>
    <artifactId>driver-binary-downloader-maven-plugin</artifactId>
    <version>1.0.4</version>
    <executions>
        <execution>
            <goals>
                <goal>selenium</goal>
            </goals>
        </execution>
    </executions>
    <configuration>
        <customRepositoryMap>repository-map.xml</customRepositoryMap>
        <onlyGetDriversForHostOperatingSystem>false</onlyGetDriversForHostOperatingSystem>
        <operatingSystems>
            <windows>true</windows>
            <linux>true</linux>
            <osx>true</osx>
        </operatingSystems>
        <rootStandaloneServerDirectory>${basedir}/selenium-driver/</rootStandaloneServerDirectory>
        <onlyGetLatestVersions>true</onlyGetLatestVersions>
        <thirtyTwoBitBinaries>false</thirtyTwoBitBinaries>
        <sixtyFourBitBinaries>true</sixtyFourBitBinaries>
        <throwExceptionIfSpecifiedVersionIsNotFound>true</throwExceptionIfSpecifiedVersionIsNotFound>
    </configuration>
</plugin>

I get this output:

--------------------------------------------------------
 DOWNLOADING SELENIUM STAND-ALONE EXECUTABLE BINARIES...
--------------------------------------------------------

 file:/C:/Users/Gili/Documents/realestate/importer/repository-map.xml is valid

Only get drivers for current Operating System: false
Download 32bit binaries: false
Download 64bit binaries: true
Download Latest Versions Only: true
Throw Exception If Specified Version Is Not Found: true

Archives will be downloaded to 'C:\Users\Gili\Documents\realestate\importer\selenium_standalone_zips'
Standalone executable files will be extracted to 'C:\Users\Gili\Documents\realestate\importer\selenium-driver'

Preparing to download Selenium Standalone Executable Binaries...

Archive file 'phantomjs-1.9.7-macosx.zip' exists   : true
Archive file 'phantomjs-1.9.7-macosx.zip' is valid : true
Binary 'phantomjs' Exists: true

Archive file 'chromedriver_linux64.zip' exists   : true
Archive file 'chromedriver_linux64.zip' is valid : true
Binary 'chromedriver' Exists: true

Archive file 'phantomjs-1.9.7-windows.zip' exists   : true
Archive file 'phantomjs-1.9.7-windows.zip' is valid : true
Binary 'phantomjs.exe' Exists: true

Archive file 'chromedriver_mac32.zip' exists   : true
Archive file 'chromedriver_mac32.zip' is valid : true
Binary 'chromedriver' Exists: true

Archive file 'chromedriver_win32.zip' exists   : true
Archive file 'chromedriver_win32.zip' is valid : true
Binary 'chromedriver.exe' Exists: true

Archive file 'phantomjs-1.9.7-linux-x86_64.tar.bz2' exists   : true
Archive file 'phantomjs-1.9.7-linux-x86_64.tar.bz2' is valid : true
Binary 'phantomjs' Exists: true

--------------------------------------------------------
SELENIUM STAND-ALONE EXECUTABLE DOWNLOADS COMPLETE
--------------------------------------------------------

However, there is a 16-second delay between the final "Binary 'phantomjs' Exists: true" line and the "SELENIUM STAND-ALONE EXECUTABLE DOWNLOADS COMPLETE" banner being printed.

This delay occurs even if all drivers have already been downloaded and extracted. In other words, if you run this plugin twice in a row, the second run is just as slow as the first, even though the output files do not need to change.

Please investigate where this 16 second delay is coming from, and try to avoid doing any work if the output files will not be changed (meaning, the source versions did not change and the output files already exist).

Ardesco commented 9 years ago

Can you provide the plugin entry in your POM so I can see what settings you have configured?

Ardesco commented 9 years ago

Ignore that, you already have :)

cowwoc commented 9 years ago

@Ardesco On a side note, removing phantomjs from my repository map makes the problem go away, so it definitely has something to do with phantomjs. Here is the original repository map:

<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<root>
    <windows>
        <driver id="googlechrome">
            <version id="2.12">
                <bitrate thirtytwobit="true" sixtyfourbit="true">
                    <filelocation>http://chromedriver.storage.googleapis.com/2.12/chromedriver_win32.zip</filelocation>
                    <hash>51eb47ad5ea91422aa1aaa400a724e7b</hash>
                    <hashtype>md5</hashtype>
                </bitrate>
            </version>
        </driver>
        <driver id="phantomjs">
            <version id="1.9.7">
                <bitrate thirtytwobit="true" sixtyfourbit="true">
                    <filelocation>https://bitbucket.org/ariya/phantomjs/downloads/phantomjs-1.9.7-windows.zip</filelocation>
                    <hash>3c70fdfba7766aa88357f387af222166c48854eb</hash>
                    <hashtype>sha1</hashtype>
                </bitrate>
            </version>
        </driver>
    </windows>
    <linux>
        <driver id="googlechrome">
            <version id="2.12">
                <bitrate sixtyfourbit="true">
                    <filelocation>http://chromedriver.storage.googleapis.com/2.12/chromedriver_linux64.zip</filelocation>
                    <hash>f306b93ff1b34af74371cee87d6560e4</hash>
                    <hashtype>md5</hashtype>
                </bitrate>
                <bitrate thirtytwobit="true">
                    <filelocation>http://chromedriver.storage.googleapis.com/2.12/chromedriver_linux32.zip</filelocation>
                    <hash>6f4041e7f8300380cc2a13babbac354e</hash>
                    <hashtype>md5</hashtype>
                </bitrate>
            </version>
        </driver>
        <driver id="phantomjs">
            <version id="1.9.7">
                <bitrate sixtyfourbit="true">
                    <filelocation>https://bitbucket.org/ariya/phantomjs/downloads/phantomjs-1.9.7-linux-x86_64.tar.bz2</filelocation>
                    <hash>ca3581dfdfc22ceab2050cf55ea7200c535a7368</hash>
                    <hashtype>sha1</hashtype>
                </bitrate>
                <bitrate thirtytwobit="true">
                    <filelocation>https://bitbucket.org/ariya/phantomjs/downloads/phantomjs-1.9.7-linux-i686.tar.bz2</filelocation>
                    <hash>98005ed0b964502b6dea2ed4fdf9b593eb6fbead</hash>
                    <hashtype>sha1</hashtype>
                </bitrate>
            </version>
        </driver>
    </linux>
    <osx>
        <driver id="googlechrome">
            <version id="2.12">
                <bitrate thirtytwobit="true" sixtyfourbit="true">
                    <filelocation>http://chromedriver.storage.googleapis.com/2.12/chromedriver_mac32.zip</filelocation>
                    <hash>259bb87f4ebf3b0bc4792ed203bd69f5</hash>
                    <hashtype>md5</hashtype>
                </bitrate>
            </version>
        </driver>
        <driver id="phantomjs">
            <version id="1.9.7">
                <bitrate thirtytwobit="true" sixtyfourbit="true">
                    <filelocation>https://bitbucket.org/ariya/phantomjs/downloads/phantomjs-1.9.7-macosx.zip</filelocation>
                    <hash>519e53cc612a57cb1c82a0cbf028e7e4bb4ceeec</hash>
                    <hashtype>sha1</hashtype>
                </bitrate>
            </version>
        </driver>
    </osx>
</root>

Ardesco commented 9 years ago

My first guess is relative file size. The phantomjs zip files are about 4 times the size of the other zip files. I notice you also have SHA-1 hashes for the phantomjs zip files and MD5 hashes for the googlechrome ones, so that could be another possible cause.

I'll have to do some investigation to find out what the bottleneck is. I'm using standard libraries to do the hash check, though, so the fix would probably be to try other libraries to see if they are more efficient.
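One way to test the hashing hypothesis is to time both algorithms against the same archive. The following is a minimal sketch using only the JDK's `MessageDigest`; the class name `HashTiming` and the command-line usage are assumptions for illustration and are not part of the plugin.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper (not plugin code): times MD5 and SHA-1 over one file.
final class HashTiming {

    // Streams the file through the digest and returns the hash as lowercase hex.
    static String hexDigest(Path file, String algorithm)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance(algorithm);
        byte[] buffer = new byte[8192];
        try (InputStream in = Files.newInputStream(file)) {
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java HashTiming <archive>");
            return;
        }
        Path archive = Paths.get(args[0]);
        for (String algorithm : new String[] {"MD5", "SHA-1"}) {
            long start = System.nanoTime();
            String hash = hexDigest(archive, algorithm);
            long millis = (System.nanoTime() - start) / 1_000_000;
            System.out.println(algorithm + " " + hash + " took " + millis + " ms");
        }
    }
}
```

If both algorithms finish in well under a second on the phantomjs archives, hashing is unlikely to account for a 16-second delay.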

Ardesco commented 9 years ago

The problem is that I'm always extracting the executable from the archive. The phantomjs tar.bz2 is larger than the other archives, so it takes longer to extract.

I'm going to need to add some logic that is aware of the filename of the executable and checks for its existence before unzipping. That should probably be configurable so that if the executable names change in the future it doesn't need a new version of the plugin to fix things.
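The existence check described above could look roughly like the sketch below. It assumes the expected executable name is supplied from configuration; `ExtractionGuard`, `needsExtraction`, and the `binaryName` parameter are hypothetical names, not the plugin's actual API.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper (not plugin code): decides whether an archive must be unpacked.
final class ExtractionGuard {

    // Returns true when the expected executable is not yet present in the
    // target directory, meaning the archive still has to be extracted.
    static boolean needsExtraction(Path targetDirectory, String binaryName) {
        return !Files.isRegularFile(targetDirectory.resolve(binaryName));
    }

    public static void main(String[] args) {
        if (args.length < 2) {
            System.out.println("usage: java ExtractionGuard <targetDir> <binaryName>");
            return;
        }
        boolean extract = needsExtraction(Paths.get(args[0]), args[1]);
        System.out.println(extract ? "extract archive" : "skip extraction");
    }
}
```

Passing the binary name in, rather than hard-coding it, matches the goal of not needing a new plugin release when an executable is renamed.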

cowwoc commented 9 years ago

Just compare the timestamp of the source file to the timestamp of the target files. If the target is as new as or newer than the source, don't re-extract.
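A minimal sketch of that timestamp comparison, using `java.nio.file.Files.getLastModifiedTime`, follows. `UpToDateCheck` and `isUpToDate` are hypothetical names. The sketch also assumes the extraction step leaves the target with its extraction-time timestamp; if the extractor restores the modification times stored inside the archive, the comparison would need adjusting.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;

// Hypothetical helper (not plugin code): make-style staleness check.
final class UpToDateCheck {

    // Returns true when the extracted file exists and its last-modified time
    // is the same as or later than the archive's, so re-extraction can be skipped.
    static boolean isUpToDate(Path archive, Path extracted) throws IOException {
        if (!Files.exists(extracted)) {
            return false;
        }
        FileTime archiveTime = Files.getLastModifiedTime(archive);
        FileTime extractedTime = Files.getLastModifiedTime(extracted);
        return extractedTime.compareTo(archiveTime) >= 0;
    }
}
```

A caller would extract only when `isUpToDate` returns false, which covers both a missing target and an archive that was re-downloaded after the last extraction.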