genome / qc-analysis-pipeline

Workflow used for WGS/WES data QC

Excessive `tar` Warnings #54

Open jasonwalker80 opened 3 years ago

jasonwalker80 commented 3 years ago

Presumably triggered by this line: https://github.com/genome/qc-analysis-pipeline/blob/cb05da7f19972db6189b37a1ef9de6561fc0e83a/tasks/Qc.wdl#L302

I'm seeing thousands of warning messages like:

tar: Ignoring unknown extended header keyword 'SCHILY.dev'
tar: Ignoring unknown extended header keyword 'SCHILY.ino'
tar: Ignoring unknown extended header keyword 'SCHILY.nlink'
tar: Ignoring unknown extended header keyword 'LIBARCHIVE.creationtime'

This can be reproduced using the NA12878 CRAM referenced in SingleSampleQC.json (soon to be a PR): gs://broad-public-datasets/NA12878/NA12878.cram

The culprit is likely the ref_cache tar archive: gs://gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.ref_cache.tar.gz

jasonwalker80 commented 3 years ago

@aofarrel pointed out this related Node.js issue: https://github.com/nodejs/node/issues/22805
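
For context, `SCHILY.*` and `LIBARCHIVE.*` are pax extended header keywords written by other tar implementations (for example bsdtar/libarchive); GNU tar does not recognize them, so it skips them and prints one warning per keyword. A possible workaround, sketched below, is to pass GNU tar's `--warning=no-unknown-keyword` option when unpacking. This assumes the task image ships GNU tar; the helper name `extract_quietly` and the paths in the usage line are hypothetical and are not taken from Qc.wdl.

```shell
# Hypothetical helper (not from Qc.wdl): extract a gzipped tar archive into a
# destination directory without the "Ignoring unknown extended header keyword"
# warnings.
extract_quietly() {
  archive="$1"
  dest="$2"
  mkdir -p "$dest"
  if tar --version 2>/dev/null | grep -q 'GNU tar'; then
    # GNU tar: silence only the unknown-keyword warning class.
    tar --warning=no-unknown-keyword -xzf "$archive" -C "$dest"
  else
    # Other tar implementations do not accept --warning; extract normally.
    tar -xzf "$archive" -C "$dest"
  fi
}
```

Usage would look something like `extract_quietly Homo_sapiens_assembly38.ref_cache.tar.gz ./ref_cache`. Only the unknown-keyword warning class is suppressed, so other tar warnings and errors still surface.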