Tests take too long to run

lsetiawan commented 3 years ago

Overview

The current tests takes way too long to run, especially when we have to interact with an external server.

Suggestions

I was thinking maybe we pick a few sample data and then stand up a local http server that serve those datasets to test against, rather from an external data repository. This way we still can test for the http/s3 read functionality without actually hitting a live server, which sometimes can timeout.

leewujung commented 3 years ago

Yes I was going to raise this issue. :) I think it's a good idea to go with a local server as you suggestion. Also I wonder if we can move those files currently in GitLFS there too, so that we don't run into the bandwidth limit (I found out that what I referred to before was the GitLFS limit and not GitHub actions).

emiliom commented 3 years ago

Are the latencies primarily on the external server end + download, or is some of it due to the size of the raw files (ie, processing time, not network latency)?

What are the smallest raw files we can use for meaningful testing? As I mentioned in https://github.com/OSOceanAcoustics/echopype/pull/240#issuecomment-765560547, we can use a 25 MB Hake Survey EK60 raw file. Do we need to test against a much larger OOI raw file, at least for EK60?? It doesn't seem like it. I haven't worked with AZFP files, so I have no clue there. And for broadband EK80, I can imagine that getting a small-ish raw file might be challenging.

Regardless, a "local" http server for testing seems like a good idea, assuming it's not a substantial new effort (and assuming I understand what that means).

lsetiawan commented 3 years ago

Are the latencies primarily on the external server end + download, or is some of it due to the size of the raw files (ie, processing time, not network latency)?

I think there are sometimes network latency or timeouts for the data straight from OOI. For example: https://github.com/OSOceanAcoustics/echopype/runs/1788810887?check_suite_focus=true#step:12:1916

I'm not sure about the others, but I think local network is better than outside when testing code, that way we don't depend on them and changes can be managed.

we can use a 25 MB Hake Survey EK60 raw file Is this the one from ncei-wcsd-archive bucket? If so, I am using that at the moment.

EK60 https://github.com/OSOceanAcoustics/echopype/blob/fa081884d0de3a6a41fa79813be6245e656de2a0/echopype/tests/test_convert.py#L324-L335

EK80 https://github.com/OSOceanAcoustics/echopype/blob/fa081884d0de3a6a41fa79813be6245e656de2a0/echopype/tests/test_convert.py#L472-L479

AZFP (For AZFP I'm using the OOI one) https://github.com/OSOceanAcoustics/echopype/blob/fa081884d0de3a6a41fa79813be6245e656de2a0/echopype/tests/test_convert.py#L394-L407

Regardless, a "local" http server for testing seems like a good idea, assuming it's not a substantial new effort (and assuming I understand what that means).

Yea we can do this 2 ways, have a http server docker image with the data in there, or have data get downloaded into the docker container when it starts up.

emiliom commented 3 years ago

Thanks for those details @lsetiawan !

but I think local network is better than outside when testing code, that way we don't depend on them and changes can be managed.

Of course. My point was to learn what the first-order cause of the slow test speed is: network latencies (including external server instabilities) vs processing time due to the size of the files.

Also, looking through the tests (first time I do that!), I see the testing of instrument-specific (EK60, etc) Convert functionality is mixed with the testing of the different source protocols (local, http, s3). The end result is a LOT of testing (lots of Convert runs), some of which seems redundant. I haven't looked at the code recently to remember how much of the fsspec machinery is generalized and instrument independent, though. Still, for example for EK60, the file Summer2017-D20170615-T190214.raw is converted four times in order to test 3 different access protocols and single file vs file lists.

lsetiawan commented 3 years ago

Convert functionality is mixed with the testing of the different source protocols (local, http, s3)

Yea I was trying to ensure that Convert is able to take in all the various protocol sources.

Still, for example for EK60, the file Summer2017-D20170615-T190214.raw is converted four times in order to test 3 different access protocols and single file vs file lists.

I was trying to test all the various cases. They are a little redundant, but I think each test captures a slight different case, if this is an overkill and if you have thoughts on how to slim it down, I am open to suggestions :smile:

Btw #246 contains the dockerfile that can be spun up to an http server container.

emiliom commented 3 years ago

I was trying to test all the various cases. They are a little redundant, but I think each test captures a slight different case, if this is an overkill and if you have thoughts on how to slim it down, I am open to suggestions

I'm viewing this as a matrix of tests, with instrument (3) x access protocol (3) x output protocol (2), plus handling of single paths vs lists of paths. That's what's present right now. But I think many of these are effectively redundant -- they don't test for something truly unique.

The raw file protocol and read handling for EK60 and EK80 is handled by common code in convert.utils.ek_raw_io.RawSimradFile.__init__, called by convert.parse_base.ParseEK.parse_raw. Convert does three things (for this discussion):

open and read the raw file, wherever it's found
actually parse it
save it to file

where parse_raw handles the first two.

Tests should aim to encompass each of those functions w/o needing to be redundant. B/c 1 is handled in common for the two EK's, it only needs to be tested for once; ie, one local, one http and one s3 test, but no need to do all 3 on both EK's. Parsing (2) obviously needs to be done independently for the two EK's. The same might be true of saving (3), but I haven't looked closely (we should look more closely, though).

So, my recommendation is to remove the EK80 http and s3 tests (b/c EK80 files are larger than EK60 files, right), and let the corresponding EK60 tests serve the purpose of testing those protocols for EK's. I might also eliminate the EK60 single-path http test, changing this:

    "input_path",
    [
        "./echopype/test_data/ek60/ncei-wcsd/Summer2017-D20170615-T190214.raw",
        "https://ncei-wcsd-archive.s3-us-west-2.amazonaws.com/data/raw/Bell_M._Shimada/SH1707/EK60/Summer2017-D20170615-T190214.raw",
        "s3://ncei-wcsd-archive/data/raw/Bell_M._Shimada/SH1707/EK60/Summer2017-D20170615-T190214.raw",
        [
            'https://ncei-wcsd-archive.s3-us-west-2.amazonaws.com/data/raw/Bell_M._Shimada/SH1707/EK60/Summer2017-D20170615-T190214.raw',
            'https://ncei-wcsd-archive.s3-us-west-2.amazonaws.com/data/raw/Bell_M._Shimada/SH1707/EK60/Summer2017-D20170615-T190843.raw',
        ],
    ],

to this:

    "input_path",
    [
        "./echopype/test_data/ek60/ncei-wcsd/Summer2017-D20170615-T190214.raw",
        "s3://ncei-wcsd-archive/data/raw/Bell_M._Shimada/SH1707/EK60/Summer2017-D20170615-T190214.raw",
        [
            'https://ncei-wcsd-archive.s3-us-west-2.amazonaws.com/data/raw/Bell_M._Shimada/SH1707/EK60/Summer2017-D20170615-T190214.raw',
            'https://ncei-wcsd-archive.s3-us-west-2.amazonaws.com/data/raw/Bell_M._Shimada/SH1707/EK60/Summer2017-D20170615-T190843.raw',
        ],
    ],

test_convert_ek60 would then be serving triple duty, testing the input protocols (open and read), the single-path vs path list, and the actual EK60 conversion (parsing and saving).

I can submit a PR with these suggested changes, but @leewujung should comment on this first.

leewujung commented 3 years ago

@emiliom and @lsetiawan : thanks for digging into this. I need to hop on a call in a few mins but some quick responses below:

I'm viewing this as a matrix of tests, with instrument (3) x access protocol (3) x output protocol (2), plus handling of single paths vs lists of paths.

It's a tensor! 🤣

B/c 1 is handled in common for the two EK's, it only needs to be tested for once

Not really. While it uses the same parser code, the EK60 and EK80 are encoded in different types of datagrams, so the parsing needs to be tested separately.

Additional complexity for EK80 data: users have the options to select all or only a subset of channels to be recorded as "power only" or "power + angle" (for which you get real numbers in backscatter), and by default the data are recorded as complex numbers. Parsing both of these is required. I recommend using the file that contains both of these types of channels (it is probably noted as BB+CW in @ngkavin 's list). There are also files where some channels were set to not collect files at all!

For later

Once we are sort of back to in-person work it'll be easier to collect some test files that are small and made for just the purpose of testing different scenarios, so that we don't need to always be handling huge files. (Yes, even for EK80 it is possible to collect small files, just not tiny files). Let's hope we get there soon!

emiliom commented 3 years ago

B/c 1 is handled in common for the two EK's, it only needs to be tested for once

Not really. While it uses the same parser code, the EK60 and EK80 are encoded in different types of datagrams, so the parsing needs to be tested separately.

My 1 covers what I called "open and read" (or: find and open the floodgate), not "parsing" strictly speaking; "parsing" is my 2, which of course is different for each instrument. How the data are encoded in datagrams, whether it's all or a subset of channels, and so on, that's downstream of the common EK code that opens the file; and as far as I can tell, an fsspec failure will be independent of which individual binary segments the parsing code wants.

lsetiawan commented 3 years ago

My 1 covers what I called "open and read" (or: find and open the floodgate), not "parsing" strictly speaking; "parsing" is my 2, which of course is different for each instrument. How the data are encoded in datagrams, whether it's all or a subset of channels, and so on, that's downstream of the common EK code that opens the file; and as far as I can tell, an fsspec failure will be independent of which individual binary segments the parsing code wants.

This still doesn't quite click in my mind Emilio :stuck_out_tongue: Feel free to create a PR with the changes you have in mind :smile: Thanks for your help!

emiliom commented 3 years ago

Adding some data points on the topic of how long the tests take. Caveats: this is based on n=1; I ran the tests locally on my old Ubuntu laptop, using the current test_convert.py in class-redesign; I was timing things manually by issuing date commands or looking at a clock that didn't show seconds; and I wasn't able to time the docker-compose build command, b/c I had run it previously and I'd have to remove the built docker images to get at this.

docker-compose -f .ci_helpers/docker/docker-compose.yaml up: 1 min
docker cp -L ./echopype/test_data docker_httpserver_1:/usr/local/apache2/htdocs/data: instantaneous (but note that I ran into this error on the first try, then it worked fine on the second try: ERRO[0001] Can't add file /usr/mayorgadat/workmain/acoustics/gh/OSOceanAcoustics/echopype/echopype/test_data/ek80/Summer2018--D20180905-T033113.raw to tar: archive/tar: write too long)
python -m pytest --log-cli-level=WARNING --verbose echopype/tests/test_convert.py: 14:49 minutes
- ek60 and prior: 2 min
- azfp: 2 min
- ek80: 11 min

leewujung commented 3 years ago

Since that longest time came from converting EK80 files due to their large size, this may be the best we can do right now. I'll move this issue out from 0.5.0 milestone, since it is tied to when we can collect some new files.

emiliom commented 3 years ago

Extra comment from @lsetiawan, from our call on Monday: the other major time sink is the building of the conda environment. For now there seems to be little we can do to speed it up (though I am curious exactly what the cache GH action does; does it cache part of the conda env, so it's not built from scratch every time?). Though one thing that comes to mind is that installing everything in requirements-dev.txt is overkill for what's actually being used, at least in the current tests (test_convert.py); maybe a minimal set of packages could be specified explicitly in class-redesign/.github/workflows/ci.yaml#L82, instead of via requirements-dev.txt.

As for the EK80 tests, I still think there's unnecessary redundancy (though Don already slimmed the EK80 test suite down a bit). I'm going to run some tests and report back.

Anyway, these are not first-order issues.

leewujung commented 3 years ago

Per discussion today with @lsetiawan and @imranmaj :

replace the test data files with new, smaller files on GDrive first
and then decide whether to move files to repo

A related issue is #263 .

One possibility to speed up tests is to run tests in parallel, but currently it does not work and we still need to investigate the reason: @lsetiawan 's last test was here: https://github.com/OSOceanAcoustics/echopype/tree/master/echopype/test_data numprocesses=4

leewujung commented 3 years ago

We are now in the position to address this. 😄

lsetiawan commented 2 years ago

PR #556 sets the numprocess back to 4. Now the test run takes about ~15min total rather than ~30min.

leewujung commented 2 years ago

Also related to #556.

OSOceanAcoustics / echopype