Closed mkalkbrenner closed 1 year ago
Did you set sendHttpHeaders: false,
in the constructor? See https://maennchen.dev/ZipStream-PHP/guide/Options.html
Setting sendHttpHeaders
to true or false doesn't change anything.
As mentioned above, zipstream just generates the zip file:
$file = fopen($file_name, 'w+b');
$zip = new ZipStream(outputStream: $file, defaultEnableZeroHeader: FALSE);
...
foreach ($files as $name => $content) {
$zip->addFile($name, $content);
}
$zip->finish();
fclose($file);
This zip file then gets uploaded embedded into a Solr command using curl.
@mkalkbrenner Can you open the archive with another unarchiving tool or is it corrupted completely?
@maennchen as metioned above, the archive could be opened using other tools. And the files in it are readable and not corrupted in any way.
@mkalkbrenner Sorry, you already wrote that.
Would you be able to share both a v2 and a v3 file with the same contents so we can compare?
I'll prepare some files ...
I can reproduce it using curl on the command line:
curl -X POST --header "Content-Type:application/octet-stream" --data-binary @configset_v3.0.2_zeroHeader.zip "http://localhost:8983/solr/admin/configs?action=UPLOAD&name=myConfigSet1"
{
"responseHeader":{
"status":500,
"QTime":13},
"error":{
"msg":"invalid entry size (expected 0 but got 24553 bytes)",
...
curl -X POST --header "Content-Type:application/octet-stream" --data-binary @configset_v3.0.2.zip "http://localhost:8983/solr/admin/configs?action=UPLOAD&name=myConfigSet2"
{
"responseHeader":{
"status":500,
"QTime":11},
"error":{
"msg":"invalid entry size (expected 4294967295 but got 24553 bytes)",
...
curl -X POST --header "Content-Type:application/octet-stream" --data-binary @configset_v2.4.0.zip "http://localhost:8983/solr/admin/configs?action=UPLOAD&name=myConfigSet3"
{
"responseHeader":{
"status":0,
"QTime":83}}
I tried the main branch now and the new function addFileFromCallback()
and provided a self-calculated exactSize
for every file.
The resulting zip file could be read by different zip clients. But Java (Solr) still doesn't accept it. But the error message is different:
curl -X POST --header "Content-Type:application/x-zip" --data-binary @configset_v3.1.0beta_callback_exactSize.zip "http://localhost:8983/solr/admin/configs?action=UPLOAD&name=myConfigSet7"
{
"responseHeader":{
"status":500,
"QTime":12},
"error":{
"msg":"invalid stored block lengths",
In order to unblock a Drupal 10.1.x compatible release of https://www.drupal.org/project/search_api_solr , can we do a 2.4.1 release including https://github.com/maennchen/ZipStream-PHP/pull/268 ? That would also solve dependency issues for other modules at the moment, for example https://github.com/PHPOffice/PhpSpreadsheet doesn't have a v3.0.2 compatible release yet.
There is for sure something fishy going on:
v2.4.0 zip
|00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 10 11 12 13 14 15 16 17 18 19|0123456789abcdef0123456789|.{}: <stdin> (zip)
0x00000|50 4b 03 04 14 00 00 00 08 00 5c 7b ce 56 63 ba 00 70 53 05 00 00 e9 5f 00 00|PK........\{.Vc..pS...._..| local_files[0:27]:
* |until 0x21858.7 (137305) | |
0x21840| 50| P| central_directories[0:27]:
0x2185a|4b 01 02 03 06 14 00 00 00 08 00 5c 7b ce 56 63 ba 00 70 53 05 00 00 e9 5f 00|K..........\{.Vc..pS...._.|
* |until 0x21eee.7 (1686) | |
0x21eda| 50 4b 05 06 00| PK...| end_of_central_directory_record{}:
0x21ef4|00 00 00 1b 00 1b 00 96 06 00 00 59 18 02 00 00 00| |...........Y.....| |
v3.0.2 zip
|00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 10 11 12 13 14 15 16 17 18 19|0123456789abcdef0123456789|.{}: <stdin> (zip)
| | | error: zip: RawLen(compressed): failed at position 56 (read size 0 seek pos 0): outside buffer
0x00000|50 4b 03 04 2d 00 00 08 08 00 0a 7c ce 56 63 ba 00 70 ff ff ff ff ff ff ff ff|PK..-......|.Vc..p........| local_files[0:1]:
* |until 0x37.7 (56) | |
0x00034| ed 5c 6d 4f e3 38 10 fe ce af c8 e5 3b 50 56 b7 77 5f e8 4a 7b 85| .\mO.8......;PV.w_.J{.| gap0: raw bits
0x0004e|22 74 c0 56 c0 8a 93 8e d3 ca 24 d3 d6 5a c7 ce 39 0e d0 fb f5 37 76 5e 80 b4|"t.V......$..Z..9....7v^..|
* |until 0x2190d.7 (137430) | |
0x218f6| 50 4b| PK| central_directories[0:27]:
0x21910|01 02 03 06 2d 00 00 08 08 00 0a 7c ce 56 63 ba 00 70 53 05 00 00 e9 5f 00 00|....-......|.Vc..pS...._..|
* |until 0x21fa3.7 (1686) | |
0x21f90| 50 4b 05 06 00 00| PK....| end_of_central_directory_record{}:
0x21faa|00 00 1b 00 1b 00 96 06 00 00 0e 19 02 00 00 00| |................|
I'm still investigating what's going on.
Ok, the v3.0.2
zip contains a 0x5653
header (ExtendedInformationExtraField
in ZipStream), maybe solr chokes on that.
That field should only be enabled if either a UTF-8 Name, UTF-8 Comment or a Zero Header is active. Neither makes sense for your file.
| | | extra_fields[0:1]:
| | | [0]{}: extra_field
0x00034|53 56 |SV | header_id: 0x5653
0x00034| 00 00 | .. | data_size: 0
| | | data: raw bits
Can you try commenting out lines 183-186 in File.php
?
(for anyone trying to replicate my investigation: I'm using fq
which is extremely helpful to debug binary files.)
Ah, found it, should've seen that directly.
Try setting enableZip64: false
when creating the ZipStream
instance.
enableZip64: false, defaultEnableZeroHeader: false
This combination made it work. Thanks for your support!
But I must admit, that this API is less intuitive as before. And it might make sense to add some hints in the migration section of the README.
@mkalkbrenner One of the issues of this lib is that error handling is quite hard since 99% of the users are sending a browser reaponse and the error is just written into the zip file.
Therefore it made sense to enable Zip64 by default since it can’t be done halfway through and would result in an overflow.
Similarly, a lot of users try to stream out file contents and the zero option is what makes this efficient.
This choice should be documented properly however, both in the Migration information as well as in the API docs.
A PR for that would be very welcome.
search_api_solr uses the stream functionality as well when users are downloading a Solr config-set via the user interface instead of using the drush command line tool.
Some hosters or Solr service providers offer a possibility to upload and deploy these zip files manually to their service and they deploy them. I assume that this will fail then, too.
Therefor I added enableZip64: false, defaultEnableZeroHeader: false
to the streaming part as well.
@mkalkbrenner FYI: PhpSpreadsheet just released a version compatible with v3.
@mkalkbrenner On further inspection, Zip64 was already enabled in v2. The difference is that the extra headers were only used if zeroHeader && zip64
while v3 was doing zeroHeader || zip64
and therefore added the header even if it was not needed.
Can you verify if it now works as well with #269?
Given that, I think the migration instructions (https://github.com/maennchen/ZipStream-PHP#archive-options) are clear enough and this ticket can be closed.
ZipStream-PHP version
3.0.2
PHP version
8.1.16
Constraints for Bug Report
[X] I'm using a version of ZipStream that is currently supported: https://github.com/maennchen/ZipStream-PHP#version-support
[X] I'm using a version of PHP that has active support: https://www.php.net/supported-versions.php
[X] I'm using a version of PHP that is compatible with your used ZipStream version.
[X] I'm using the latest release of the used ZipStream major version.
Summary
Drupal's Search API Solr module uses zipstream to upload config-set zip files to Solr (or ZooKeeper). It perfectly worked with 1.x and 2.x versions for years. But it doesn't work anymore with 3.0.2 or the main branch. zipstream is not used to do HTTP, but to write a zip file to disk. This zip file then gets uploaded using https://github.com/solariumphp/solarium The zip file itself seems to be ok and could be opened and extracted using other zip clients. But the client built into Java seems to have an issue with files created with zipstream 3.x.
Current behavior
The issue seems to be about the file size or the size reported within the zip file:
java.util.zip.ZipException: invalid entry size (expected 0 but got 24553 bytes)
If I set
defaultEnableZeroHeader: false
, the error message changes, but still exists:java.util.zip.ZipException: invalid entry size (expected 4294967295 but got 24553 bytes)
How to reproduce
drush search-api-solr:upload-configset -v --numShards=1
Expected behavior
The zip files produced with 2.4.0 still work. 3.0.x should be able to produce zip files that could be parsed by java.util.zip