MobilityData / gtfs-validator

Canonical GTFS Validator project for schedule (static) files.
https://gtfs-validator.mobilitydata.org/
Apache License 2.0
275 stars 100 forks source link

`-input` flag deletes existing files #114

Closed irees closed 4 years ago

irees commented 4 years ago

Describe the bug I mistakenly used -input to refer to my zip file, instead of -zip, which results in the file specified by -input to be immediately deleted.

To Reproduce

(env) mbp:test irees$ ls -ldhF ok.zip
-rw-r--r--  1 irees  staff    38M Apr 15 14:32 ok.zip

(env) mbp:test irees$ java -jar gtfs-validator.jar -i ok.zip
[INFO ] 2020-04-15 14:32:53.767 [main] Main - --url and relative path to zip file(--zip option) not provided. Trying to find zip in: /Users/irees/test
[INFO ] 2020-04-15 14:32:53.785 [main] Main - zip file found: /Users/irees/test/ok.zip
[INFO ] 2020-04-15 14:32:53.786 [main] Main - --output not provided. Will place execution results in: /Users/irees/test/output
[INFO ] 2020-04-15 14:32:53.786 [main] Main - Unzipping archive
[ERROR] 2020-04-15 14:32:53.801 [main] Main - An exception occurred: /Users/irees/test/ok.zip (Is a directory)
[INFO ] 2020-04-15 14:32:53.804 [main] Main - Took 987ms

(env) mbp:test irees$ ls -ldhF ok.zip
drwxr-xr-x  2 irees  staff    64B Apr 15 14:32 ok.zip/

Expected behavior Fail when -input path exists.

Witnessed behavior See above

Screenshots See above

Environment used

irees commented 4 years ago

A couple other notes: 1. it runs the delete before checking if the -zip path is present or valid gtfs 2. it's a recursive delete of the entire tree.

lionel-nj commented 4 years ago

Thanks for the observation @irees, MobilityData will be exploring this.

ghost commented 4 years ago

Hey @irees , -input path will exist if the validator has been run previously. Not sure we should fail in that case?

barbeau commented 4 years ago

@fabrice-v I would suggest considering re-naming the -input to something different, as I think it's confusing to users. I think most end users would view what it does (serve as unzipped location of a GTFS zip file) as a cache or working directory, not as input to the application. The user-facing "input" is really the -zip param.

ghost commented 4 years ago

That is a good point indeed. Shall we keep the possibility to specify where said cache is created which was the original intent or simplify it to something like 'an input folder will be created with the content of the provided dataset' ?

barbeau commented 4 years ago

@fabrice-v I think it's a good idea to keep the cache location configurable via command line, as these files can get huge and users may not want to unzip to the same drive as where they are running the application.

ghost commented 4 years ago

Alright here is how I propose to move forward

barbeau commented 4 years ago

SGTM!

lionel-nj commented 4 years ago

Alright here is how I propose to move forward

  • rename -zip to -inputzip
  • rename -input to -extract Thoughts?

This had been implemented in d55683b78aa905bf0b8ad186ae66a4483898928a, see linked PR

barbeau commented 4 years ago

@lionel-nj Could you update the release docs in that PR too? https://github.com/MobilityData/gtfs-validator/blob/master/RELEASE.md

lionel-nj commented 4 years ago

Sure, working on it.

ghost commented 4 years ago

Note that in the final version of the fix -zip was renamed to -zipinput