Closed hadigh closed 2 years ago
This appears to be a problem with the IRIS code in COSMOS Table 4 in this row:
60 | Central and Eastern US Network | UCSD | NA
The NA is getting parsed as NaN rather than as a string. But I also don't think that NA is the correct code for "Central and Eastern US Network". It should probably be N4, but that doesn't make sense with UCSD as the abbreviation (N4 should be ASL/USGS). Since we didn't make this table, I'm not sure we should modify it. I'll try to make it so that this doesn't raise an error, but we need clarification from COSMOS on what the intended values were.
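For anyone hitting the same thing, here is a minimal sketch of the parsing behavior and one possible guard. It assumes the table is loaded with pandas from CSV-like text; the actual loader, file format, and column names in gmprocess may differ. The CSV content, the `match_network` helper, and the event ID `na12345` are hypothetical and only for illustration.

```python
import io

import pandas as pd

# Hypothetical stand-in for the COSMOS Table 4 row discussed above.
csv_text = (
    "Code,Network,Abbreviation,IRIS Code\n"
    "60,Central and Eastern US Network,UCSD,NA\n"
)

# By default, pandas treats the literal string "NA" as a missing value (NaN).
default_df = pd.read_csv(io.StringIO(csv_text))

# keep_default_na=False disables the default missing-value markers,
# so "NA" is kept as a string.
string_df = pd.read_csv(io.StringIO(csv_text), keep_default_na=False)


def match_network(dataframe, eventid):
    """Return the first IRIS code that prefixes eventid, skipping non-string codes."""
    eventid = eventid.lower()
    for code in dataframe["IRIS Code"]:
        if not isinstance(code, str):
            continue  # e.g. NaN from a cell that was parsed as missing
        if eventid.startswith(code.lower()):
            return code.lower()
    return None


default_match = match_network(default_df, "na12345")
string_match = match_network(string_df, "na12345")
```

The guard only avoids the AttributeError; it does not settle which code the table should contain.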
Hi, I am looking into the discrepancy and will let you know when we resolve it.
Nicholas Novoa COSMOS President
Thanks @NNovoa-CDWR!
@emthompson-usgs
Thanks for looking into this. I can now successfully generate V2 files for my example, but I am getting the error below when setting label='unprocessed':
File ~/Tools/groundmotion-processing/gmprocess/io/cosmos/cosmos_writer.py:734, in CosmosWriter.write(self)
    732 t2 = time.time()
    733 t_write.append(t2 - t1)
--> 734 text_av = sum(t_text) / ntraces
    735 int_av = sum(t_int) / ntraces
    736 float_av = sum(t_float) / ntraces
ZeroDivisionError: division by zero
Thanks for the report. I'll look into this.
The underlying reason for this is that when @mhearne-usgs wrote this first version of the writer, he only included support for data that has been converted to physical units of acceleration (see here). This is why none of the unprocessed records get included and this leads to a division by zero error. We can potentially add support for raw data with units of counts, or fix this so that it exits gracefully with a better error message but I'll need to consult @mhearne-usgs about this.
@hadigh I just sent in a PR that will avoid the error and add a logging statement to indicate that no traces were processed in this case.
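As a rough illustration of that kind of guard (not the actual change in the PR), a sketch could look like the following. The `average_time` helper is hypothetical; only the `sum(...) / ntraces` pattern comes from the traceback above.

```python
import logging


def average_time(samples, ntraces):
    """Return the mean time per trace, or None when no traces were written."""
    if ntraces == 0:
        # Nothing matched (e.g. only unprocessed records), so report it
        # instead of dividing by zero.
        logging.warning("No traces were processed; skipping timing averages.")
        return None
    return sum(samples) / ntraces


no_traces = average_time([], 0)
two_traces = average_time([1.0, 3.0], 2)
```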
The question I have for you is: How important is it for you to be able to write raw data with the COSMOS file? My (possibly incorrect) impression is that this format is mostly intended for people that are not able to directly make use of the ASDF format, which is exhaustive in terms of data/metadata. I expect that unprocessed/raw data may not be desired by these users. In fact, they won't have access to the instrument response without the ASDF file and so they wouldn't be able to convert to physical units for their analysis.
Also, one suggestion for your script would be to add the logger so that logging messages get printed like this:
from gmprocess.utils.logging import setup_logger
setup_logger()
@emthompson-usgs thanks for clarifying this and also the hint for logging.
What you mentioned regarding the instrument response and user interest makes perfect sense. What I am after is somewhere between COSMOS V1 and V2: time series in physical units with only a basic baseline adjustment and instrument response correction applied, and no additional filtering. Perhaps I can achieve this by storing two versions of the processed time series under different labels in the h5 file.
Yeah, that seems reasonable. In fact the only sensible use of alternative labels that I can think of would be something like this, where you've processed the records using two different sets of processing parameters. We haven't had any call for that ourselves and so I don't think that there's a simple way to automate this (e.g., use a different config file for different labels for a given project). For now, I'll close this issue. If you have ideas about changes to the code that would facilitate this, please create a new issue.
Regarding the logger: that is what gets done when the gmrecords command starts. It would be nice if we had a way of having it set up automatically for scripts like this, but I don't know how to do that.
cosmos_write_example.zip
Hi gmprocess team,
I am trying to write a COSMOS file for processed streams in a .h5 file. I am using CosmosWriter and am getting the following AttributeError:
File ~/Tools/groundmotion-processing/gmprocess/io/cosmos/cosmos_writer.py:155, in Table4.get_matching_network(self, eventid)
    153 eventid = eventid.lower()
    154 for idx, row in self._dataframe.iterrows():
--> 155     network = row["IRIS Code"].lower()
    156     if eventid.startswith(network):
    157         return network
AttributeError: 'float' object has no attribute 'lower'
Any help would be appreciated