Closed thbar closed 1 year ago
Thank you for testing out the tool, and the detailed feed back. It was very helpful, with comments and examples on how to reproduce the errors. We have answered in each separate issue.
We are working on an updated documentation, all sections will be updated during the summer, including your suggestions.
Many thanks @pkvarnfors for your feedback!
A new version of the tool is available with performance enhancements, improvements in memory usage and updated documentation.
Hello there! I work on https://transport.data.gouv.fr/ and I did some testing on the NeTEx validator, I wanted to share it with the authors & other users here. Thanks for your work on this!
I have not yet been able to review the output of the reports themselves, so this first round is more about onboarding / "surprise report" and potential production issue (RAM use etc), than the quality of the reports.
Hope this helps!
Installing from source is useful
Being able to install from source is useful in our case, because our main app is itself a "non privileged Docker container" at the moment ; we cannot run a docker command from there.
We cannot either run
go cmd/*
in production, because thego
binary won't be there in the final Docker image.I explored the Dockerfile and found how you actually build the binary:
https://github.com/ITxPT/DATA4PTTools/blob/c9c257389aec270c73b23977e9966c5f671e2d0f/Dockerfile#L7
Using this command allowed me to build a complete standalone binary which I can use.
Initially, the
cmd/*
pattern was a bit confusing to see, since it was a bit unclear if these were different programs, or parts of the same program (the latter was the answer), but that is not a big problem.The doc could be improved to document how to prepare a full binary.
There is telemetry by default
I was quite surprised to see that there is some hardcoded endpoint to which usage data appears to be sent by default.
I have created a specific issue for that part:
https://github.com/ITxPT/DATA4PTTools/issues/23
Although I understand the usefulness of this to you while building the tool, I believe this is problematic from a RGPD point of view.
To the very least I feel this should be advertised in the readme, or even an opt-in behaviour (at the cost of reducing your tracing rate).
I don’t believe most users will realize this is happening!
Time taken & memory used
Both time taken & memory used are a concern at the moment for production automated use (as part of server apps).
I have commented on existing issues:
-m
flag:Non-linear memory use (or too high use) will be a problem for production use, since we allocate a fixed number of GB per container, and cloud environments can be costly from a RAM point of view.
Documentation improvement
For docker run, it could be nice to assume that the user validates files in its current folder ; it can be traditionally done by using
$(pwd)
on Mac/Linux and%cd%
on Windows.The help at GitHub - ITxPT/DATA4PTTools: Shared space for the development of the DATA4PT validation tool(s) lacks the
-i myfile
flag at time of writing.Some dependencies should be upgraded
Warning on “sarama” while compiling from source · Issue #22 · ITxPT/DATA4PTTools · GitHub
That's it for today ; thanks for open-sourcing the tool & the discussion!