LibraryOfCongress / bagger

The Bagger application packages data files according to the BagIt specification.

Difference in validation status of bag using Python and Java versions #50

Closed johnscancella closed 6 years ago

johnscancella commented 6 years ago

from user https://github.com/jamiepb

I've discovered a discrepancy between the Java version and the Python version when validating a bag with an extra file. I have a bag that acquired a thumbs.db file when someone opened it. I ran a tool that is built on bagit-python (current pip install version, 1.6.2) that did not validate the bag, noting that the expected file count and bag size did not match. I then investigated and saw the thumbs.db file that had appeared after bagging. However, I then ran Bagger 2.7.6 and the bag validated.

Software and OS:

BagIt Python 1.6.2 and Bagger 2.7.6
Windows 7 Enterprise

Given: A bag with a thumbs.db file in it that is not on the manifest
When: I run a Python tool that calls bagit py 1.6.2
Then: The bag does not validate

BUT:

Given: A bag with a thumbs.db file in it that is not on the manifest
When: I open the bag in Bagger 2.7.6 and validate
Then: The validation is successful

Thank you for looking into it.
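
For anyone checking a bag for this kind of stray file, a minimal sketch follows. It uses only the Python standard library, not bagit-python or Bagger, and the function name `find_unlisted_payload_files` is hypothetical. It assumes each manifest line is a checksum, whitespace, then a path relative to the bag root, and it does not handle percent-encoded paths.

```python
from pathlib import Path


def find_unlisted_payload_files(bag_dir):
    """Return payload paths under data/ that no manifest-*.txt lists."""
    bag = Path(bag_dir)
    listed = set()
    for manifest in bag.glob("manifest-*.txt"):
        for line in manifest.read_text(encoding="utf-8").splitlines():
            if not line.strip():
                continue
            # Assumed line layout: "<checksum> <path>"; split once on whitespace.
            _checksum, path = line.split(None, 1)
            # Some checksum tools prefix the path with "*"; drop it if present.
            listed.add(path.strip().lstrip("*"))
    on_disk = {
        p.relative_to(bag).as_posix()
        for p in (bag / "data").rglob("*")
        if p.is_file()
    }
    return sorted(on_disk - listed)
```

Calling `find_unlisted_payload_files("path/to/bag")` on a bag like the one described should return the thumbs.db path if it sits under data/ and is absent from every manifest.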

johnscancella commented 6 years ago

@jamiepb can you attach a sample bag that has this problem? I don't have a Windows XP machine handy to readily create a thumbs.db file.

jamiepb commented 6 years ago

Hello,

I zipped up a copy of the bag that has this problem, but it’s too large to attach. And strangely I can’t reproduce the problem – I tried making a new bag and opening the images within it to create a thumbs.db file, and that bag did not validate. So I have no idea what’s going on with the original bag in question.

Jamie

johnscancella commented 6 years ago

If you are able to create a smaller sample bag that reproduces the error I will take a look, otherwise I can't troubleshoot this.
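
One way to build such a small sample bag is sketched below. It uses only the Python standard library, and the function name `make_repro_bag`, the file names, and the file contents are illustrative assumptions, not taken from the original bag. It writes a one-file payload, a BagIt 0.97 declaration, an MD5 manifest, and a Payload-Oxum entry, then drops in a Thumbs.db stand-in that the manifest does not list.

```python
import hashlib
from pathlib import Path


def make_repro_bag(bag_dir):
    """Create a tiny bag, then add a stray file that no manifest lists."""
    bag = Path(bag_dir)
    data = bag / "data"
    data.mkdir(parents=True)

    payload = data / "hello.txt"
    payload.write_bytes(b"hello bag\n")

    (bag / "bagit.txt").write_text(
        "BagIt-Version: 0.97\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )
    digest = hashlib.md5(payload.read_bytes()).hexdigest()
    (bag / "manifest-md5.txt").write_text(
        f"{digest}  data/hello.txt\n", encoding="utf-8"
    )
    # Payload-Oxum is "<total bytes>.<file count>" for the payload at bagging time.
    size = payload.stat().st_size
    (bag / "bag-info.txt").write_text(f"Payload-Oxum: {size}.1\n", encoding="utf-8")

    # Stand-in for the file that appeared after bagging; not a real thumbnail cache.
    (data / "Thumbs.db").write_bytes(b"not a real thumbnail cache")
    return bag
```

The resulting directory is only a few hundred bytes, so it can be zipped and attached, or validated in both tools to compare results. Whether it reproduces the discrepancy is not known.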

johnscancella commented 6 years ago

Closing due to lack of response.