fossas / fossa-cli

Fast, portable and reliable dependency analysis for any codebase. Supports license & vulnerability scanning for large monoliths. Language-agnostic; integrates with 20+ build systems.
https://fossa.com
Other
1.27k stars 174 forks source link

Vanilla tar #1452

Closed csasarak closed 3 months ago

csasarak commented 3 months ago

Overview

For a while now we've been using a custom fork of tar due to some previous issues we had with some container tarballs. In the meantime, many changes and improvements have been added to haskell/tar such that I think it's valid for us to switch back to upstream.

This change also seems to fix a few other issues that we had as well: CLI-side license scan of archive uploads with long filename fails Binary File Error

As well as an unticketed recursive symlink issue.

Acceptance criteria

Testing plan

Chris test:

To test, I looked at the various issues that caused us to fork in the first place.

The first one is #1305 where we were getting a NoTarFormat error. I tested by downloading the image from the linked Jira ticket and running cabal run fossa -- container analyze -o <file> off of this branch. Everything completed successfully. For extra verification, I built an old version of fossa from before #1305 and found that that was still failing so the test case was valid.

The next ticket which touched this was ANE-689. ~Unfortunately, the example tar file linked there doesn't work.~ EDIT: I was able to get a copy of that image from our support staff and it works! Running cabal run fossa -- container analyze -o public.ecr.aws/lambda/python:3.9 does seem to work correctly.

In order to test the PAX support further, I used some of the tests from our fork of haskell/tar. I did this by porting the fork's tests over to the release branch of upstream tar. You can see that work here. Many of the tests were checking for equality between read (the upstream read) and read', the function from the test branch, those can be safely ignored. The more important tests are the ones which correctly read pax headers here. I found that all of the tests worked as expected.

Scott tests:

For the following tests, I downloaded the build from this PR and put it in ~/tmp/fossa-upstream-tar. Then I tested with current fossa and fossa-upstream-tar

Test that you can have a vendored dependency with a recursive symlink

mkdir -p root/CCOSS/openssl/abc
ln -s ../../openssl root/CCOSS/openssl/abc/openssl

tree root
root
└── CCOSS
    └── openssl
        └── abc
            ├── foo.txt
            └── openssl -> ../../openssl
ls -l root/CCOSS/openssl/abc/openssl/
total 0
drwxr-xr-x  4 scott  staff  128 Jul 16 11:53 abc

fossa-deps.yml:

vendored-dependencies:
  - path: 'root'
    name: everything

With current fossa, I see this error:

ERROR] An issue occurred

  *** Relevant Errors ***

      Error: An exception occurred:/private/var/folders/45/k76d2pgj6mlb1vpx_2gyl7z80000gn/T/fossa-temp-ab37d3a16c0ed4ab/rootdb47cf13-6279-47c0-a935-82ed17166a4d: withBinaryFile: user error (File name too long (cannot split))

With fossa from this branch, it works

Test that you can have a vendored dependency with a broken symlink

mkdir vendored
cd vendored
touch foo
ln -s bar foo
rm foo
cd ..
ls -l vendored
total 0
lrwxr-xr-x  1 scott  staff     3B Jul 17 00:14 bar -> foo

fossa-deps.yml like this:

vendored-dependencies:
  - name: vendored
    path: vendored

With current fossa, I see this error:

 *** Relevant Errors ***

      Error: An exception occurred:/private/var/folders/45/k76d2pgj6mlb1vpx_2gyl7z80000gn/T/fossa-temp-40c49bcc82fd5c28/.003d7652-9845-4847-af5c-86d0c4780dab: withBinaryFile: does not exist (No such file or directory)

With fossa from this branch it works.

Test that you can have an vendored dependency with a long filename

mkdir vendored/lkajdlaksadjaslkjdllaskjdsalkdjsalkdjaslkdjsalkdjaslkjdlaskjdalskjdlaskjdalskjdaslkdalskdjlakdjalskdjlkasdjaslkdjalksdjlaksjdlkasdlkasjdlkasdlkasdjasldkjaslkdjaslkdjlkasjdklasdlkaskljdlakjsd
cp $themis/license-data/licenses/mit.LICENSE vendored/lkajdlaksadjaslkjdllaskjdsalkdjsalkdjaslkdjsalkdjaslkjdlaskjdalskjdlaskjdalskjdaslkdalskdjlakdjalskdjlkasdjaslkdjalksdjlaksjdlkasdlkasjdlkasdlkasdjasldkjaslkdjaslkdjlkasjdklasdlkaskljdlakjsd

fossa-deps.yml like this:

vendored-dependencies:
  - name: vendored
    path: vendored

With current fossa, I see this error:

  *** Relevant Errors ***

      Error: An exception occurred:/private/var/folders/45/k76d2pgj6mlb1vpx_2gyl7z80000gn/T/fossa-temp-1a6d5e8f8b72d1b0/vendoredb13eab1a-50c9-48e1-a01c-19d4ed2ef555: withBinaryFile: user error (File name too long)

With fossa from this branch it works

Test that revisions do not change

add a fossa-deps.yml with a vendored dependency in it to some project. Make sure that there is no version specified on the vendored dependency. Run fossa analyze and fossa-upstream-tar analyze. Check that the revision of the vendored dependency does not change.

Made with fossa analyze: https://app.fossa.com/projects/custom%2B24987%2Fsimple-vendored/refs/branch/master/2024-07-17T22%3A22%3A52Z/browse/dependencies

Made with fossa-upstream-tar analyze: https://app.fossa.com/projects/custom%2B24987%2Fsimple-vendored/refs/branch/master/2024-07-17T22%3A22%3A52Z/browse/dependencies

In both cases the version of the dependency is 31401800367aa098e1fd00651d5058dd

Risks

We could cause a regression in our tar support. I don't see anything that indicates to me that upstream tar specifically addressed the bugs we did, I've only verified that these work empirically for the test cases I was able to find.

References

Checklist

csasarak commented 3 months ago

I was able to get a copy of the image from ANE-689 and it looks like it works on this branch. So I'm pretty confident that this is safe.