OSOceanAcoustics / echopype

Enabling interoperability and scalability in ocean sonar data analysis
https://echopype.readthedocs.io/
Apache License 2.0

Produce test data set #5

Closed (leewujung closed 5 years ago)

leewujung commented 6 years ago

Produce small data sets that contain only a couple of pings for testing purposes (#4 travis-ci)

Need data from:

erinann commented 6 years ago

This is issue number 8 from OHW18_echopype.

amarburg commented 5 years ago

And also a plan for how to distribute the test data when running tests. Git submodule? Just check it into the repo? Download from the cloud?

leewujung commented 5 years ago

I'll push up what I have done for the folder restructuring once I get the minimum fake test working :P I think it's convenient to just keep sample data under ./echopype/data, as recommended by shablona. Currently I have a 1 MB EK60 file and a 5 MB AZFP file. I'm planning on finding someone with the echosounders to generate a test file that has just a few pings.

amarburg commented 5 years ago

It's a good question. I guess test data is in the Git repo but is not necessarily included in the Python package ... I'd make it a priority to keep the Python package as small as possible.

leewujung commented 5 years ago

I think that since this package is for unpacking binary/hex data, it is a good idea to include a minimal set of original data files and their unpacked counterparts so that users have something to check against. This can be specified in package_data within setup.py.
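For reference, a minimal sketch of what the package_data approach could look like. The glob pattern, the package list, and the helper names (build_setup_kwargs, main) are assumptions for illustration, not the actual echopype setup.py; only the ./echopype/data location comes from this thread.

```python
# Hypothetical setup.py sketch: ship everything under echopype/data with the package.
# The pattern "data/*" is an assumption; narrow it if only some files should ship.
PACKAGE_DATA = {
    "echopype": ["data/*"],  # paths are relative to the echopype package directory
}


def build_setup_kwargs():
    """Collect the keyword arguments that would be passed to setuptools.setup()."""
    return {
        "name": "echopype",
        "packages": ["echopype"],
        "package_data": PACKAGE_DATA,
    }


def main():
    # Imported here so the sketch can be inspected without invoking a build.
    from setuptools import setup

    setup(**build_setup_kwargs())

# A real setup.py would call main() at module level.
```

With something like this, the sample files are installed alongside the code, which is the trade-off discussed above: convenient for users, at the cost of a larger package.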

amarburg commented 5 years ago

Yeah, I agree. It was just a question of putting the data in the Git repo versus storing it somewhere else and needing to pull it in. If the test files are going to be small (tens of MB), then it's a moot point ... I just didn't have an expectation of how big (or small) the test files would be.

leewujung commented 5 years ago

The EK60 converter (#22) is currently tested against /echopype/data/DY1801_EK60-D20180211-T164025.raw

leewujung commented 5 years ago

Seems like it's fine to use the small-ish data set (1-5 MB range) for now.
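One way to keep the sample data in that range is a small guard that a test could run against the data folder. This is only a sketch: the function name oversized_files and the 5 MiB ceiling are assumptions chosen to match the "1-5 MB" figure above, not anything that exists in echopype.

```python
from pathlib import Path

# Assumed ceiling, taken from the "1-5 MB range" mentioned in this thread.
MAX_BYTES = 5 * 1024 * 1024


def oversized_files(data_dir, max_bytes=MAX_BYTES):
    """Return the sorted names of regular files in data_dir larger than max_bytes."""
    return sorted(
        path.name
        for path in Path(data_dir).iterdir()
        if path.is_file() and path.stat().st_size > max_bytes
    )
```

A test could then assert that oversized_files("echopype/data") returns an empty list, so a large file checked in by accident fails CI instead of quietly growing the repo and the package.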