This PR adds auto-generated differential tests to compare the output of PolyFile against file/libmagic using 900+ files from Ange Albertini's Corkami corpus.
This has revealed several bugs in both PolyFile and libmagic; the PolyFile bugs are fixed in this PR. In particular, PolyFile's handling of libmagic regular expressions was faulty.
This PR also includes several improvements to PolyFile's interactive debugger, which were implemented in order to investigate the differentials.
Out of the 942 files in the Corkami corpus, PolyFile now matches at least as many MIME types as file for all but two files. One of those discrepancies is due to an incorrect classification on the part of file, and the other discrepancy is due to an incorrect classification on the part of PolyFile.
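For illustration, the pass/fail criterion described above can be sketched as follows. This is a minimal sketch, not the test code added in this PR: `find_discrepancies` and the two detector callables are hypothetical names, and it assumes "at least as many MIME types" means a comparison of counts per file.

```python
from pathlib import Path
from typing import Callable, Iterable, List, Set

# Hypothetical type: any function that maps a file path to the set of
# MIME types a tool reports for it (e.g. a wrapper around PolyFile or `file`).
MimeDetector = Callable[[Path], Set[str]]


def find_discrepancies(
    paths: Iterable[Path],
    polyfile_mimes: MimeDetector,
    file_mimes: MimeDetector,
) -> List[Path]:
    """Return the files for which PolyFile reports fewer MIME types than file.

    Assumption: the criterion is a count comparison, not a subset check.
    """
    return [p for p in paths if len(polyfile_mimes(p)) < len(file_mimes(p))]


if __name__ == "__main__":
    # Stub results standing in for real tool output.
    polyfile_results = {
        Path("a.bin"): {"application/pdf", "application/zip"},
        Path("b.bin"): set(),
    }
    file_results = {
        Path("a.bin"): {"application/pdf"},
        Path("b.bin"): {"image/png"},
    }
    bad = find_discrepancies(
        polyfile_results.keys(),
        polyfile_results.__getitem__,
        file_results.__getitem__,
    )
    print(bad)
```

Passing the detectors in as callables keeps the comparison independent of how either tool is invoked, so the same check can be run against stubbed results or against real output.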