reanthony-usgs / US_Noise_PSDs


To Do list. #2

Closed aringler-usgs closed 5 years ago

aringler-usgs commented 5 years ago

We need to check the following things. I will do 4, 5, and 6, once Rob checks 1, 2, and 3 again.

1) I am seeing the pickles at less than half the size of the txt files: around 128K, while the text files are 302K. Can you verify this is what you are finding and in the right ballpark? What would the new size calculation be?

2) I am finding the data request to be about a factor of 100 slower than the PSD calculation. Can you confirm this?

3) We should check the timing with reorganized code. Do my changes speed things up?

4) We need to deal with multiple location codes (rare, but could throw off calculations).

5) We should parallelize this (I can do this once you verify 1, 2, and 3).

6) We should include a log file for problematic pulls from IRIS (I can do this).
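Item 5 (parallelizing over stations) could look something like the sketch below. `run_station` and the station list are hypothetical stand-ins, not the repo's actual code; the real worker would request data from IRIS and compute/write the PSDs for one station. Since the bottleneck here is the webservice request (I/O-bound), a thread pool is a reasonable first cut:

```python
# Sketch only: run_station and nets_stas are made-up placeholders.
from multiprocessing.dummy import Pool  # thread pool; fine for I/O-bound requests

def run_station(net_sta):
    net, sta = net_sta
    # Real code: fetch waveforms, compute PSDs, write pickles, log failures.
    return '{}_{}'.format(net, sta)

nets_stas = [('TA', 'A04D'), ('TA', 'B05D'), ('TA', 'C06D')]  # hypothetical

pool = Pool(3)
results = pool.map(run_station, nets_stas)
pool.close()
pool.join()
print(results)
```

Swapping `multiprocessing.dummy.Pool` for `multiprocessing.Pool` gives real processes if the PSD computation itself turns out to dominate.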

reanthony-usgs commented 5 years ago

I made a few modifications to Adam's code and re-ran it on the same 101 days of ANMO data I used in the original timing test. Observations:

1) I am still seeing the pickles as being ~50% larger than the flat files (~150% of the text-file size): hourly pickle PSDs are ~350K while text files are ~240K. I still get roughly 25 TB for text files and now 37 TB for pickles to do all of TA (assuming 2000 stations and 2 years of data).

2) I can confirm that the data request is now the bottleneck; however, I am only seeing it be a factor of 3-6 slower than the PSD calculation.

3) The revised code is slower, and during my run it had data completeness problems. The original code returned 100% of PSDs while this one only returned 59.6%. The issue seems to occur during the repeated requests to the DMC (lots of exceptions thrown in a row). The original code returned all PSDs in 39 minutes and this code took 54 minutes.

aringler-usgs commented 5 years ago

Hey, I am getting the following for 1 hour webservice requests:

```
It took 0.02124500274658203 s to write PSDs
It took 1.92600679397583 s to load in all the data
It took 0.03880476951599121 s to calculate the PSDs
Here is the txt file size: 322369
Here is the pickle size: 131224
It took 0.02248692512512207 s to write PSDs
It took 1.4095520973205566 s to load in all the data
It took 0.04191398620605469 s to calculate the PSDs
Here is the txt file size: 322429
Here is the pickle size: 131224
It took 0.02286505699157715 s to write PSDs
It took 1.3334143161773682 s to load in all the data
It took 0.04757404327392578 s to calculate the PSDs
Here is the txt file size: 322331
Here is the pickle size: 131224
```

The entire time for a 1 hour webservice request.

It took 73.65430402755737  s to do the entire computation

For a 1 day request I get the following:

It took 26.282557010650635  s to do the entire computation

For 2 day requests I start to get timeout issues.

You might make sure you are testing on the same windows that I am.

Here is a second look at them:

```
-rw-r--r--  1 aringler  staff   128K Aug 13 08:40 PSD_IU_ANMO_BHZ_2019_00208.pckl
-rw-r--r--  1 aringler  staff   315K Aug 13 08:40 PSD_IU_ANMO_BHZ_2019_00208.txt
-rw-r--r--  1 aringler  staff   128K Aug 13 08:40 PSD_IU_ANMO_BHZ_2019_00209.pckl
-rw-r--r--  1 aringler  staff   315K Aug 13 08:40 PSD_IU_ANMO_BHZ_2019_00209.txt
-rw-r--r--  1 aringler  staff   128K Aug 13 08:40 PSD_IU_ANMO_BHZ_2019_00210.pckl
-rw-r--r--  1 aringler  staff   315K Aug 13 08:40 PSD_IU_ANMO_BHZ_2019_00210.txt
-rw-r--r--  1 aringler  staff   128K Aug 13 08:40 PSD_IU_ANMO_BHZ_2019_00211.pckl
-rw-r--r--  1 aringler  staff   315K Aug 13 08:40 PSD_IU_ANMO_BHZ_2019_00211.txt
-rw-r--r--  1 aringler  staff   128K Aug 13 08:40 PSD_IU_ANMO_BHZ_2019_00212.pckl
-rw-r--r--  1 aringler  staff   315K Aug 13 08:40 PSD_IU_ANMO_BHZ_2019_00212.txt
-rw-r--r--  1 aringler  staff   128K Aug 13 08:41 PSD_IU_ANMO_BHZ_2019_00213.pckl
-rw-r--r--  1 aringler  staff   315K Aug 13 08:41 PSD_IU_ANMO_BHZ_2019_00213.txt
-rw-r--r--  1 aringler  staff   128K Aug 13 08:41 PSD_IU_ANMO_BHZ_2019_00214.pckl
```

So we agree on 2. However, 1 and 3 still show disagreement. Perhaps you can identify the source of the error on these two and then, assuming the timing is okay, we can move forward.

aringler-usgs commented 5 years ago

We are done with 1) and left with 3).

aringler-usgs commented 5 years ago

*with python3.
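The Python version likely explains the size disagreement in 1): Python 2's default pickle protocol (0) is ASCII-based and much larger than the binary protocols Python 3 uses. A minimal sketch with a synthetic PSD list (the value count and formats here are illustrative, not the repo's actual writer):

```python
import os
import pickle
import tempfile

# Hypothetical hourly PSD values; real files hold periods and PSD levels.
psd = [0.1 * i + 0.123456789 for i in range(13000)]

tmp = tempfile.mkdtemp()
txt_path = os.path.join(tmp, 'psd.txt')
p0_path = os.path.join(tmp, 'psd_proto0.pckl')
p2_path = os.path.join(tmp, 'psd_proto2.pckl')

# Text file, one value per line (similar in spirit to numpy.savetxt).
with open(txt_path, 'w') as f:
    for v in psd:
        f.write('%.18e\n' % v)

# Protocol 0 (the Python 2 default): ASCII-based, bloated.
with open(p0_path, 'wb') as f:
    pickle.dump(psd, f, protocol=0)

# Protocol 2 (binary, the kind Python 3 picks by default): compact.
with open(p2_path, 'wb') as f:
    pickle.dump(psd, f, protocol=2)

sizes = {p: os.path.getsize(p) for p in (txt_path, p0_path, p2_path)}
print(sizes)
```

So whether pickles come out smaller or larger than the text files depends on which protocol (hence which Python) wrote them.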

reanthony-usgs commented 5 years ago

3) Adam's new changes to the code crush it. I re-ran 2017 on N4 station T42B and it ran in 46 minutes; it took my original code 51.3 hours. Assuming all stations go like this, all of TA should take ~12k-15k hours of computation time for all components. If we use 20 cores on our server, I get a little under a month of processing time.

The code also returns ~93% of the PSDs that we would expect if the station had 100% uptime (it probably doesn't). One little issue to sort out will be ensuring that each PSD estimate begins exactly on the hour. I will take a stab at this.

We probably will also want to set up some database/file directory to store the outputs. And perhaps an additional script to query this database.
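Pinning each PSD window to the top of the hour could be sketched like this: round the trace start time up to the next whole hour, then slice hour-long windows. This uses plain epoch seconds; with ObsPy you would do the same arithmetic on `UTCDateTime` timestamps (the function names are mine, not the repo's):

```python
# Round an epoch time (seconds) up to the next whole hour.
def next_whole_hour(t):
    return t if t % 3600 == 0 else t - (t % 3600) + 3600

# Yield (start, end) pairs for every complete hour inside [start, end].
def hour_windows(start, end):
    t = next_whole_hour(start)
    while t + 3600 <= end:
        yield (t, t + 3600)
        t += 3600

wins = list(hour_windows(1000, 12000))
print(wins)  # -> [(3600, 7200), (7200, 10800)]
```

Partial hours at either end are simply dropped, which also sidesteps short-segment PSD artifacts.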

reanthony-usgs commented 5 years ago

I think I have fixed the issue so that every PSD starts exactly on the hour.

We need to scrub location codes, as right now the code wildcards them from the inventory. The location codes are sometimes flipped in the stream, so the PSD database is a mix of location codes. We want a consistent location code.

aringler-usgs commented 5 years ago

I would just do something like this, but switch to location codes.

```python
def scrub_inventory(inv):
    # Channel codes whose PSDs we do not want; everything else stays.
    codes_remove = ['HN1', 'HN2', 'HNZ', 'HNE', 'HNN',
                    'LH1', 'LH2', 'LHZ', 'LHN', 'LHE',
                    'VH1', 'VH2', 'VHZ', 'VHN', 'VHE',
                    'UH1', 'UH2', 'UHZ', 'UHN', 'UHE']
    for net in inv:
        for sta in net:
            # Collect first, then remove, so we never mutate the channel
            # list while iterating over it.
            chans_remove = [chan for chan in sta if chan.code in codes_remove]
            for chan in chans_remove:
                try:
                    sta.channels.remove(chan)
                except ValueError:
                    # Channel was already removed (e.g. a duplicate entry).
                    pass
    return inv
```

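Switched to location codes, the same pattern might look like the sketch below. The keep-list (`'00'` and `''`) is an assumption, not the repo's actual choice; `location_code` matches the ObsPy `Channel` attribute, and the stub classes exist only so the sketch runs without ObsPy installed:

```python
# Keep only channels whose location code is in keep_locs (assumed canonical).
def scrub_locations(inv, keep_locs=('00', '')):
    for net in inv:
        for sta in net:
            # Collect first, then remove, to avoid mutating during iteration.
            chans_remove = [chan for chan in sta
                            if chan.location_code not in keep_locs]
            for chan in chans_remove:
                sta.channels.remove(chan)
    return inv

# Minimal ObsPy-like stand-ins for demonstration.
class Chan:
    def __init__(self, code, loc):
        self.code, self.location_code = code, loc

class Sta:
    def __init__(self, channels):
        self.channels = channels
    def __iter__(self):
        return iter(self.channels)

sta = Sta([Chan('BHZ', '00'), Chan('BHZ', '10'), Chan('BHZ', '')])
inv = [[sta]]  # network -> stations
scrub_locations(inv)
print([c.location_code for c in sta.channels])  # -> ['00', '']
```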
aringler-usgs commented 5 years ago

Could you run using the following:

```
screen
python PSD*.py
ctrl-a + d
```

reanthony-usgs commented 5 years ago

Ran the first test run on Friday evening (8/23/2019). Issues found so far:

1) When running a test case on a year, do not try to run over all of TA. Data does not exist for many of these stations due to the ~2 year emplacement of each TA site. Suggest "test case" flag on a small subset of Alaska TA stations.

2) PSD directories (line 39) do not appear to have been created for stations which did have data. The only files which appear to have been generated on this test run were log files.

aringler-usgs commented 5 years ago

> 1. When running a test case on a year, do not try to run over all of TA. Data does not exist for many of these stations due to the ~2 year emplacement of each TA site. Suggest "test case" flag on a small subset of Alaska TA stations.
>
> 2. PSD directories (line 39) do not appear to have been created for stations which did have data. The only files which appear to have been generated on this test run were log files.

Both of these things are done.

  1. Check out additional test runs (Find more bugs) @reanthony-usgs

  2. When I switch to LH* I get the following: `ValueError: noverlap must be less than nperseg`. @reanthony-usgs

  3. Need to add rerequest mechanism. @aringler-usgs

  4. Need to add new mapping using cartopy @aringler-usgs

  5. Need to add parsing/summary code methods.

reanthony-usgs commented 5 years ago

4 is done: nfft must be shortened when using LH* data.
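The `noverlap must be less than nperseg` error typically shows up when the segment length is derived from the channel's sampling rate but the overlap stays hard-coded for 40 sps data; at 1 sps (LH*) the segment shrinks below the overlap. A sketch of deriving both from the sampling rate (the parameter names and 4096 s window are illustrative, not the repo's actual values):

```python
# Pick nfft/noverlap from the sampling rate so the overlap is always
# strictly less than the segment length (what scipy/matplotlib require).
def psd_params(sampling_rate, window_seconds=4096, overlap_fraction=0.5):
    nfft = int(window_seconds * sampling_rate)
    noverlap = int(overlap_fraction * nfft)
    assert noverlap < nfft
    return nfft, noverlap

print(psd_params(40.0))  # BH* at 40 sps -> (163840, 81920)
print(psd_params(1.0))   # LH* at 1 sps  -> (4096, 2048)

# The failure mode: an overlap tuned for 40 sps reused on 1 sps data.
nfft_lh = int(4096 * 1.0)
hardcoded_noverlap = 81920
print(hardcoded_noverlap < nfft_lh)  # False -> the ValueError above
```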

Crashing on TA ALASKA 2018 run:

```
/usr/lib/python2.7/site-packages/obspy-1.1.1-py2.7-linux-x86_64.egg/obspy/clients/fdsn/client.py:867: UserWarning: Unknown Error (BadStatusLine): ''
  warnings.warn(str(e))
/usr/lib/python2.7/site-packages/obspy-1.1.1-py2.7-linux-x86_64.egg/obspy/core/stream.py:3032: UserWarning: No matching response information found.
  warnings.warn(str(e))
Traceback (most recent call last):
  File "./PSD_Compression_Test.py", line 150, in
    pool.map(run_station, nets_stas)
  File "/usr/lib64/python2.7/multiprocessing/pool.py", line 250, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/usr/lib64/python2.7/multiprocessing/pool.py", line 554, in get
    raise self._value
TypeError: 'unicode' object is not callable
/usr/lib/python2.7/site-packages/obspy-1.1.1-py2.7-linux-x86_64.egg/obspy/clients/fdsn/client.py:867: UserWarning: Unknown Error (BadStatusLine): ''
  warnings.warn(str(e))
/usr/lib/python2.7/site-packages/obspy-1.1.1-py2.7-linux-x86_64.egg/obspy/core/stream.py:3032: UserWarning: No matching response information found.
  warnings.warn(str(e))
```

aringler-usgs commented 5 years ago

I think I fixed this bug. It was an issue with grabbing bad metadata. Try it out again.

Check out additional test runs (Find more bugs) @reanthony-usgs

When I switch to LH* I get the following: `ValueError: noverlap must be less than nperseg`. @reanthony-usgs

Need to add rerequest mechanism. @aringler-usgs

Need to add new mapping using cartopy @aringler-usgs

Need to add parsing/summary code methods.

reanthony-usgs commented 5 years ago

Another bug running all of TA LHZ. This was during the day with a pool of 30 processes. The load was in the low 30s.

```
Traceback (most recent call last):
  File "./PSD_Compression_Test.py", line 150, in
    pool.map(run_station, nets_stas)
  File "/usr/lib64/python2.7/multiprocessing/pool.py", line 250, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/usr/lib64/python2.7/multiprocessing/pool.py", line 554, in get
    raise self._value
TypeError: 'unicode' object is not callable
```

aringler-usgs commented 5 years ago

I have the following bug when I run your code:

```
  File "noise_map.py", line 9, in <module>
    per, psd = utils.get_psd_stats(path, sta, stime, etime, 10.)
  File "/home/aringler/data_stuff/US_array/US_Noise_PSDs/utils.py", line 42, in get_psd_stats
    a = pickle.load(pickle_file)
UnicodeDecodeError: 'ascii' codec can't decode byte 0x96 in position 0: ordinal not in range(128)
```

I am testing this on J20K.
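That `UnicodeDecodeError` is the classic symptom of Python 3's `pickle.load` reading a pickle written by Python 2: the byte 0x96 is part of a binary payload that trips the default ASCII decoding. The standard fix is `pickle.load(f, encoding='latin1')`. A sketch of a tolerant loader (`load_psd` is a name I made up, not something in `utils.py`):

```python
import pickle

def load_psd(path):
    """Load a PSD pickle, falling back to latin1 decoding for files
    written by Python 2 (whose str payloads are not valid ASCII)."""
    with open(path, 'rb') as f:
        try:
            return pickle.load(f)
        except UnicodeDecodeError:
            f.seek(0)
            return pickle.load(f, encoding='latin1')

# Demo round trip (a Python-3-written file takes the first branch; the
# fallback branch is for the Python-2-era archive files).
import os
import tempfile
demo = os.path.join(tempfile.mkdtemp(), 'demo.pckl')
with open(demo, 'wb') as f:
    pickle.dump({'per': [1.0, 2.0], 'psd': [-140.0, -135.0]}, f, protocol=2)
print(load_psd(demo))
```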

aringler-usgs commented 5 years ago

I also ran into the following issue:

```python
directs = path.split('/')
f = directs[-1]
info = f.split('_')
if debug:
    print(info)
year, time = info[5], info[6]
day, hour = time[0:3], time[3:5]
```

This is coded for your path, not a general path.
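A more path-agnostic version would parse only the basename and index fields from the end, so leading directories (and any underscores in them) don't matter. A sketch keeping the same day/hour split as the snippet above (`parse_psd_filename` is an illustrative name):

```python
import os

def parse_psd_filename(path, debug=False):
    # Strip directories and the extension, then split the remaining
    # PSD_<net>_<sta>_<chan>_<year>_<dddhh> fields on underscores.
    base = os.path.splitext(os.path.basename(path))[0]
    info = base.split('_')
    if debug:
        print(info)
    year, time = info[-2], info[-1]   # count from the end, not the front
    day, hour = time[0:3], time[3:5]
    return year, day, hour

print(parse_psd_filename('/any/dir/at_all/PSD_IU_ANMO_BHZ_2019_00208.pckl'))
# -> ('2019', '002', '08')
```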

aringler-usgs commented 5 years ago

Also, please move the collection of PSDs to /TEST_ARCHIVE_OLD

aringler-usgs commented 5 years ago

We need to add a flexible path to the PSD computation code.

aringler-usgs commented 5 years ago

All these issues have been dealt with. I am closing this.