jerrygreen opened this issue 5 years ago
Did you ever find a fix for this?
@AzureZhen I've found that macOS uses a utility called ditto
to zip/extract things. It's a default macOS tool, so I simply used it for extraction and it works perfectly.
@JerryGreen, I am facing the same issue. How did you fix it using ditto? Could you please attach sample code here?
Many thanks.
@kangwen6663
ditto -xk /path/from /path/to
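If you want to call it from Node rather than from the terminal, a minimal sketch using child_process (macOS only, since ditto ships with the OS; paths are placeholders):

// Sketch: shell out to macOS's ditto as a fallback extractor.
const { execFile } = require('child_process');

function dittoExtract(zipPath, destDir) {
  return new Promise((resolve, reject) => {
    execFile('ditto', ['-xk', zipPath, destDir], (err) => {
      if (err) reject(err);
      else resolve(destDir);
    });
  });
}

// Usage: await dittoExtract('./temp/file.zip', './temp/');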
This error is thrown in cases where the file comment field exceeds the maximum 65k size. I have seen it with some external signing schemes.
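One rough way to check for that is to see whether the End Of Central Directory signature ("PK\x05\x06") still sits within the window readers typically scan: the 22-byte EOCD record plus at most 65,535 bytes of comment, counted from the end of the file. A minimal sketch, assuming a local file path:

// Sketch: is the EOCD record within the last 22 + 65535 bytes of the file?
// Trailing data (e.g. appended signature blocks) can push it outside this window.
const fs = require('fs');

function eocdWithinScanWindow(zipPath) {
  const buf = fs.readFileSync(zipPath);
  const eocdSignature = Buffer.from([0x50, 0x4b, 0x05, 0x06]); // "PK\x05\x06"
  const windowStart = Math.max(0, buf.length - (22 + 0xffff));
  return buf.lastIndexOf(eocdSignature) >= windowStart;
}

// eocdWithinScanWindow('./temp/file.zip') === false usually means an oversized
// comment or extra data appended after the archive.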
Is anyone still experiencing the issue? I would appreciate a solution that is not dependent on ditto, if possible.
const AdmZip = require('adm-zip');

const zipPath = `./temp/file.zip`;
const zip = new AdmZip(zipPath);
zip.extractAllTo(`./temp/`, true);
The terminal output is:
Error: Invalid or unsupported zip format. No END header found
at readMainHeader (/Users/username/Projects/example-project/node_modules/adm-zip/zipFile.js:107:10)
at new module.exports (/Users/username/Projects/example-project/node_modules/adm-zip/zipFile.js:19:3)
at new module.exports (/Users/username/Projects/example-project/node_modules/adm-zip/adm-zip.js:20:11)
at module.exports._installWordpress (/Users/username/Projects/example-project/generators/app/index.js:102:17)
at module.exports.writing (/Users/username/Projects/example-project/generators/app/index.js:45:12)
at Object.<anonymous> (/Users/username/Projects/example-project/node_modules/yeoman-generator/lib/index.js:399:25)
at /Users/username/Projects/example-project/node_modules/run-async/index.js:49:25
at new Promise (<anonymous>)
at /Users/username/Projects/example-project/node_modules/run-async/index.js:26:19
at /Users/username/Projects/example-project/node_modules/yeoman-generator/lib/index.js:400:11
Are you able to extract or view your file with archive managers like 7-Zip, WinRAR, etc.?
@5saviahv Yes, the file seems to be okay and it extracts with the tools I've tried.
I also seem to only get this error sometimes, not all the time (using the same file) which makes it slightly more challenging to understand the cause.
Thanks!
Interesting. There may be many culprits, but the way you describe it, it sounds like a race condition (two or more processes want to access the file at the same time).
- Do you only read from the file, or do you also write to it?
- Do you open the file multiple times?
  const zipPath = `./temp/file.zip`;
  const zip1 = new AdmZip(zipPath);
  const zip2 = new AdmZip(zipPath);
- Do you use async functions?
- On which OS does it fail?
Thanks for getting back to me on this. A race condition could be the case: after the file is extracted, the contents are copied, and then the original zip file and extracted folder are removed.
Do you use async functions?
I am unsure, but I have provided an example below.
It fails on which OS?
It fails when running the script on Node v15.2.1 or v14.15.1 on macOS Big Sur 11.0.1 (20B29).
Here's the function I have:
// Assumes AdmZip, fs, fse (fs-extra) and chalk are required at module scope.
_extractZip(projectName, fileZipName, copyPath = null) {
  const extractedFolder = `./${projectName}/${fileZipName.replace('.zip', '')}`;
  // Extracts contents of zip file.
  const extractPath = `./${projectName}/${fileZipName}`;
  const zip = new AdmZip(extractPath);
  zip.extractAllTo(`${projectName}/`, true);

  let extractError = null;

  // If a copy path is not provided files won't be moved.
  if (copyPath) {
    fse.copy(extractedFolder, copyPath, { overwrite: true }, err => {
      if (err) {
        extractError = `
          Could not copy files to ./${copyPath}. \n
          ./${err}
        `;
      } else {
        // Cleans up by removing extracted folder and zip.
        try {
          fs.rmdirSync(extractedFolder, { recursive: true });
        } catch (err) {
          extractError = `
            Could not remove extractedFolder. \n
            ./${err}
          `;
        }
        // Remove zip file as it is no longer needed.
        try {
          fs.unlinkSync(extractPath);
        } catch (err) {
          extractError = `
            Could not remove ./${extractPath}. \n
            ./${err}
          `;
        }
      }
    });
  } else {
    // Cleans up by removing extracted folder and zip.
    // Lets user know that program did not work as intended.
    try {
      fs.rmdirSync(extractedFolder, { recursive: true });
    } catch (err) {
      extractError = `
        Could not remove extractedFolder. \n
        ./${err}
      `;
    }
    // Remove zip file as it is no longer needed.
    try {
      fs.unlinkSync(extractPath);
    } catch (err) {
      extractError = `
        Could not remove ./${extractPath}. \n
        ./${err}
      `;
    }
    this.log(`${chalk.red('Error:')} Could not copy files (copyPath is not present).
      Zip file and extracted files were removed.`);
  }
}
Thanks again for looking into it. If this is a race condition, is there a way to only access the file once it is done extracting?
The code seems OK. It should not cause any trouble.
How big are your files? I mean, aren't any of them Zip64? Many archive managers switch to Zip64 if you use big files or have many files. Adm-zip can read Zip64 files, but there is a higher chance of failure.
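If you want to check whether an archive uses Zip64, one rough heuristic is to look for the Zip64 end-of-central-directory record signature ("PK\x06\x06") in the file; a minimal sketch, with the path as a placeholder:

// Sketch: rough Zip64 detection via the Zip64 EOCD record signature.
// Heuristic only: compressed data could contain these bytes by chance.
const fs = require('fs');

function looksLikeZip64(zipPath) {
  const buf = fs.readFileSync(zipPath);
  return buf.includes(Buffer.from([0x50, 0x4b, 0x06, 0x06])); // "PK\x06\x06"
}

// looksLikeZip64('./temp/file.zip');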
@5saviahv Thank you; for some reason I cannot replicate the error lately. I am not sure whether it is Zip64, but one of the files I have used this function on is wordpress.zip.
Since this is so intermittent (on that same file) I am unsure what might be causing it, but I haven't been getting the error lately. I really appreciate your help, I will comment here again if it returns and will add details. 🙏
This happened to me once, but when retrying (same code, same file) it worked. Weird.
I have this same issue. I think I can somehow replicate it.
I download the file from my AWS S3 bucket, then use adm-zip to unzip it, then use it with cheerio.
The trick is that I need to leave my computer alone for 5-10 minutes and then run my code, and it will sometimes (around 40% of the time) give the error "Invalid or unsupported zip format. No END header found".
Otherwise it works fine. The file is an EPUB, always the same file, so the file itself is usable.
Here is my code.
const aws = require('aws-sdk');
const fs = require('fs');
const path = require('path');
const AdmZip = require('adm-zip');
const cheerio = require('cheerio');

getfile();

async function getfile() {
  try {
    aws.config.update({
      // accessKeyId / secretAccesskey are defined elsewhere
      accessKeyId: accessKeyId,
      secretAccessKey: secretAccesskey,
      region: 'us-east-2'
    });
    var s3 = new aws.S3();
    var params = {
      Bucket: 'original',
      Key: 'file.epub'
    };
    let readStream = s3.getObject(params).createReadStream();
    let writeStream = fs.createWriteStream(path.join(__dirname, params.Key));
    readStream.pipe(writeStream);
    readStream.on('end', () => {
      console.log("this ends");
      console.log("paramkey = ", params.Key);
      writeStream.end();
      epubToText(params.Key);
    });
  } catch (error) {
    console.log("error = ", error);
  }
}

async function epubToText(path2) {
  try {
    console.log("path2 = ", path2); // always already exists
    console.log("111111");
    let zip = new AdmZip(path2); // it stops here: the console only logs "111111" and not "22222"
    console.log("22222");
    let $ = cheerio.load(zip.readAsText('META-INF/container.xml'), { xmlMode: true, decodeEntities: false });
    console.log("$ = ", $);
    let contentOpfPath = $("container rootfiles rootfile").attr("full-path");
    console.log("contentOpfPath = ", contentOpfPath);
    let contentOpfFolder = contentOpfPath.split("/");
    console.log("contentOpfFolder = ", contentOpfFolder);
  } catch (err) {
    console.log(err);
  }
}
I can't let this happen in production though. The file must be processed and served to the customer. The file is 6.5 MB. I use Node 12.16.1 on Windows 7, and this also happens on my Mac running Big Sur.
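One possible cause in the snippet above, in line with the race-condition discussion earlier in this thread, is that epubToText() runs on the read stream's end event, before the write stream has necessarily flushed everything to disk. A minimal sketch of the safer ordering, reusing the readStream, writeStream and params names from above:

// Sketch: only open the archive after the write stream reports 'finish',
// i.e. after the downloaded bytes are fully written to disk.
readStream.pipe(writeStream);

writeStream.on('finish', () => {
  epubToText(params.Key);
});

writeStream.on('error', (err) => {
  console.log('error = ', err);
});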
I made a workaround for this, as I was fetching the zip file externally and then saving it locally before extracting.
A setTimeout solved my issue.
const saveZIPFile = async () => {
  return new Promise((resolve) => {
    data.body.pipe(fs.createWriteStream(path.resolve(__dirname, `${project}.zip`)));
    data.body.on('end', () => {
      setTimeout(() => {
        resolve();
      }, 1000);
    });
  });
};

await saveZIPFile();

var zip = new AdmZip(path.resolve(__dirname, `${project}.zip`));
zip.extractAllTo(path, true);
I had the same problem and it turned out to be an issue with how I downloaded the file: I never waited for the download to complete before attempting to unzip it.
Solution: properly await the download and only then start working with the ZIP file.
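A minimal sketch of that ordering, using stream/promises (Node 15+) so the download is fully flushed to disk before the archive is opened; the URL and paths are placeholders:

// Sketch: await the whole download, then extract.
const fs = require('fs');
const https = require('https');
const { pipeline } = require('stream/promises');
const AdmZip = require('adm-zip');

async function downloadAndExtract(url, zipPath, outDir) {
  await new Promise((resolve, reject) => {
    https.get(url, (res) => {
      if (res.statusCode !== 200) {
        reject(new Error(`Unexpected status code: ${res.statusCode}`));
        return;
      }
      // pipeline resolves only after the write stream has finished.
      pipeline(res, fs.createWriteStream(zipPath)).then(resolve, reject);
    }).on('error', reject);
  });

  const zip = new AdmZip(zipPath);
  zip.extractAllTo(outDir, true);
}

// await downloadAndExtract('https://example.com/archive.zip', './archive.zip', './extracted');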
I had this problem too, and it turned out the URL was returning an HTTP 302 (redirect) instead of a 200 (success), so my zip file ended up with 0 bytes.
To fix that I changed the code a bit:
import fs from 'fs';
import https from 'https';
import admZip from 'adm-zip';

const url = 'http://blablabla.zip';
const zipFile = './blablabla.zip';
const zipFileStream = fs.createWriteStream(zipFile);

function downloadFile(url, attempt = 1) {
  return new Promise((resolve, reject) => {
    https
      .get(url, (res) => {
        if (res.statusCode === 302 || res.statusCode === 301) {
          if (attempt > 5) {
            // prevent infinite loops if there's a redirect loop
            reject(new Error('Too many redirects'));
            return;
          }
          const newUrl = res.headers.location;
          console.log(`Redirecting to: ${newUrl}`);
          downloadFile(newUrl, attempt + 1).then(resolve, reject);
          return;
        }
        if (res.statusCode !== 200) {
          reject(new Error(`Unexpected status code: ${res.statusCode}`));
          return;
        }
        res.pipe(zipFileStream);
        zipFileStream.on('finish', () => {
          zipFileStream.close(resolve);
        });
        zipFileStream.on('error', (error) => {
          reject(error);
        });
      })
      .on('error', (error) => {
        reject(error);
      });
  });
}

await downloadFile(url);
// eslint-disable-next-line no-console
console.log('Download completed, extracting zip...');
const zip = new admZip(zipFile); // eslint-disable-line new-cap
zip.extractAllTo('./blablabla', true);
// eslint-disable-next-line no-console
console.log('zip extracted');
My issue here was that I passed .DS_Store when attempting to unzip. Ensure you're filtering out invalid file paths.
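For example, a minimal sketch of that kind of filtering, assuming the candidate paths come from a directory listing (dirPath is a placeholder):

// Sketch: only hand real .zip paths to AdmZip; skip metadata files like .DS_Store.
const fs = require('fs');
const path = require('path');

function listZipFiles(dirPath) {
  return fs.readdirSync(dirPath)
    .filter((name) => !name.startsWith('.') && name.toLowerCase().endsWith('.zip'))
    .map((name) => path.join(dirPath, name));
}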
I had the same problem downloading a file using axios and fs.createWriteStream.
I solved it by waiting for the writeStream's close event and then resolving the Promise:
import os from 'os';
import fs from 'fs';
import path from 'path';
import axios from 'axios';
import AdmZip from 'adm-zip';

async function download(url: string): Promise<string> {
  const outputFile = path.join(os.tmpdir(), 'archive.zip');
  const { data } = await axios.get(url, { responseType: 'stream' });

  // Pipe the data to a file
  const writeStream = fs.createWriteStream(outputFile);
  data.pipe(writeStream);

  // Return a promise and resolve when the download finishes
  return new Promise((resolve, reject) => {
    data.on('error', () => {
      reject(`Failure while retrieving remote data (source: ${url})`);
    });
    writeStream.on('close', () => {
      resolve(outputFile);
    });
    writeStream.on('error', err => {
      reject(err);
    });
  });
}

async function extract(url: string, outputDir: string) {
  // 1. Download the zip file
  const file = await download(url);
  // 2. Extract the archive
  const zip = new AdmZip(file);
  zip.extractAllTo(outputDir, /*overwrite*/ true);
  return outputDir;
}

extract("https://example.com/archive.zip", path.join(os.tmpdir(), 'extracted')).catch((error) => {
  console.error(error);
});
In my case it was my fault with curl and the way I was passing my binary.
I should have passed the binary as @/Users/gsaukov/Downloads/index.zip, but I was passing the path /Users/gsaukov/Downloads/index.zip as a string.
The correct curl version (with @, so the file is sent as binary data) is below:
curl -X PUT \
-H 'Content-Type: application/zip' \
-H 'accept: application/json' \
--insecure \
--data-binary @/Users/gsaukov/Downloads/index.zip \
https://localhost:3000/artifact
I found the problem in my case. Wrong ❌:
const outputFilePath = "./ExtractTextInfoFromPDF.zip";
console.log(`Saving asset at ${outputFilePath}`);

const writeStream = fs.createWriteStream(outputFilePath);
streamAsset.readStream.pipe(writeStream);

let zip = new AdmZip(outputFilePath); // Wrong ❌
let jsondata = zip.readAsText('structuredData.json');
let data = JSON.parse(jsondata);
console.log("data", data);

data.elements.forEach((element: any) => {
  if (element.Path.endsWith('/H1')) {
    console.log(element.Text);
  }
});
Right ✅:
const outputFilePath = "./ExtractTextInfoFromPDF.zip";
let zip = new AdmZip(outputFilePath); // Right ✅ !!!!

console.log(`Saving asset at ${outputFilePath}`);
const writeStream = fs.createWriteStream(outputFilePath);
streamAsset.readStream.pipe(writeStream);

let jsondata = zip.readAsText('structuredData.json');
let data = JSON.parse(jsondata);
console.log("data", data);

data.elements.forEach((element: any) => {
  if (element.Path.endsWith('/H1')) {
    console.log(element.Text);
  }
});
I think we have to create a new AdmZip() before writing to it; maybe that's why I get an INVALID_FORMAT() in adm-zip/zipFile.js.
Error:
My (simple) code:
I'm using macOS 10.14.1.
By opening it from Finder (using the default Archive Utility app) it unzips nicely, no problems.