SeisComP3 / seiscomp3

SeisComP is a seismological software for data acquisition, processing, distribution and interactive analysis.

Allow for more than 1 waveform archive with recordstream.service = combined #9

Closed yannikbehr closed 9 years ago

yannikbehr commented 9 years ago

To access waveforms in scolv's picker we configure recordstream.service = combined with seedlink and an sds archive. However, due to limited disk space we can only keep the last five weeks in the sds archive. It would be great if we could request older waveforms using fdsnws. Because fdsnws access to data is much slower than sds archive access, we would like to keep the latter for the most recent events.

gempa-jabe commented 9 years ago

The difficulty here is the definition of the URI. Currently it is combined://slink/localhost;arclink/localhost?user=bla??slinkMax=7200, where ? introduces an option of the Arclink source and ?? a global option of the combined source. How can we introduce a third source syntactically and, preferably, get rid of the ?? separator?

combined://slink?max=7200/:18000;arclink?max=3d/localhost:18001?user=foo&pwd=bar;sdsarchive?max=1y//home/data/archive;sdsarchive//home/data/archive2

And now try to combine that with e.g. the decimation source: http://www.seiscomp3.org/doc/jakarta/current/apps/global_recordstream.html#decimation

Opinions?

gempa-jabe commented 9 years ago

Actually that would work quite well:

dec://combined?rate=1/slink?max=7200/:18000;arclink?max=3d/localhost:18001?user=foo&pwd=bar;sdsarchive?max=1y//home/data/archive;sdsarchive//home/data/archive2

Only combining a combined source would cause some headache for the parser.

Anyhow, I would recommend using another name for the source if the syntax changes too much, e.g. "???" ;)

yannikbehr commented 9 years ago

Combining two combined sources also gives me a headache. Is this currently possible?

gempa-jabe commented 9 years ago

Technically yes, but not via the URI. The problem is the semicolons, which are used as separators. There is no grouping (brackets or something else) in the definition. One could think of something like:

combined://source1/(params1);source2/(params2)

and it could work as

combined://combined/(slink/localhost:18000;arclink/localhost:18001??rtMax=7200);sdsarchive//path/to/archive??rtMax=31104000

This issue is then obsolete because it can be defined like that. I am currently unsure what to prefer ...

yannikbehr commented 9 years ago

If you allowed combining combined sources, you could end up with two real-time sources (combined://combined1/(params1);combined2/(params2)). Then you'd have to decide which one takes precedence. Sounds very complicated to me.

jsaul commented 9 years ago

Jan Becker wrote on 02/09/2015 03:25 PM:

The difficulty here is the definition of the URI. Currently it is combined://slink/localhost;arclink/localhost?user=bla??slinkMax=7200, where ? introduces an option of the Arclink source and ?? a global option of the combined source. How can we introduce a third source syntactically and, preferably, get rid of the ?? separator?

How about using brackets?

combined://(source1);(source2)??rtMax=7200

where each of source1 and source2 can in turn be nested. In a config file, source1 and source2 could be defined as parameters, e.g.

realtime=slink://server1:18000
archiveshort=arclink://server2:18001
archivelong=fdsnws://server3/fdsnws/dataselect/1/query

and then

archive=combined://(${archiveshort});(${archivelong})??rtMax=2d

finally:

combined://(${realtime});(${archive})??rtMax=2h

Not really eye-candy either, especially not if they need to be combined in a single expression, but still manageable.
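To see what the single combined expression would look like, the nested parameters can be expanded by hand. The sketch below uses Python's string.Template purely as a stand-in for the configuration file's variable substitution; it is not how SeisComP resolves parameters, and the bracket syntax shown is only the proposal from this comment, not an implemented format.

```python
from string import Template

# Parameters as proposed above; the server names are placeholders.
params = {
    "realtime": "slink://server1:18000",
    "archiveshort": "arclink://server2:18001",
    "archivelong": "fdsnws://server3/fdsnws/dataselect/1/query",
}

# Inner combined source: short-term archive first, long-term archive second.
params["archive"] = Template(
    "combined://(${archiveshort});(${archivelong})??rtMax=2d"
).substitute(params)

# Outer combined source: real-time source first, nested archive second.
full = Template(
    "combined://(${realtime});(${archive})??rtMax=2h"
).substitute(params)

print(full)
```

The expanded string repeats combined:// and :// several times, which is the "not really eye-candy" part.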

gempa-jabe commented 9 years ago

Then you'd have to decide which one takes precedence. Sounds very complicated to me.

No, the rules are simple: the first source is used if the time span is less than rtMax, the second source otherwise, and so on (in the case of combined combined sources). It is always up to the user to define meaningful and working URIs. Since you opened this issue, what would you prefer: two or more sources, or combining combined sources?
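The precedence rule can be written down as a short sketch. This is only an illustration of the rule as stated here, not SeisComP code: select_source and the (name, limit) list are hypothetical, and the "time span" is treated as a plain number of seconds because the thread does not spell out how it is measured.

```python
from typing import List, Optional, Tuple


def select_source(span_seconds: float,
                  sources: List[Tuple[str, Optional[float]]]) -> str:
    """Return the first source whose limit exceeds the time span.

    `sources` is ordered by precedence; a limit of None means
    "no limit", so that source acts as the final fallback.
    """
    for name, limit in sources:
        if limit is None or span_seconds < limit:
            return name
    raise LookupError("no source covers a span of %s s" % span_seconds)


# Hypothetical chain: seedlink up to 2 h, arclink up to 2 d, then fdsnws.
chain = [("slink", 7200), ("arclink", 2 * 86400), ("fdsnws", None)]

print(select_source(3600, chain))        # within 2 h
print(select_source(86400, chain))       # within 2 d
print(select_source(10 * 86400, chain))  # older than 2 d
```

With nested combined sources the same rule applies at each level: the outer limit picks between the first source and the nested group, and the inner limit picks within that group.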

jsaul commented 9 years ago

Either way, brackets are the way to resolve this, I think.

gempa-jabe commented 9 years ago

combined://(${realtime});(${archive})??rtMax=2h

Here I would like to avoid duplicate :// and use:

combined://slink?max=2h/server1:18000;combined/(arclink?max=2d/server2:18001;fdsnws/server3/fdsnws/dataselect/1/query)

gempa-jabe commented 9 years ago

Either way, brackets are the way to resolve this, I think.

It is just a matter of personal taste whether we support a combined source with more than two sources (no brackets required) or combined combined sources. I think I currently prefer the latter option, but that can change in a few minutes ;)

EDIT: actually parentheses ...

yannikbehr commented 9 years ago

I don't have any preference, whatever is easier to implement. I just can't think of a situation where you would want to connect to two different seedlink servers (since both need to have an identical channel list, right?).

jsaul commented 9 years ago

yannikbehr wrote on 02/09/2015 04:19 PM:

I don't have any preference, whatever is easier to implement.

Easiest is probably to live with the current combined:// syntax and its limitations. You could probably achieve what you want by running a local, auxiliary fdsnws instance that uses combined:// to access your sds short-term archive and the fdsnws long-term archive. You then combine the real-time seedlink source with the auxiliary fdsnws in scolv. Wouldn't that be feasible?

gempa-jabe commented 9 years ago

I added a42aedb316fb5e37d61bbe12a261407612417f27, which allows configuring the combined source with parentheses, e.g.:

combined://slink/rt-server;combined/(arclink/ar-server;fdsnws/url??1stMax=2d)
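Why the parentheses help can be shown with a small sketch: semicolons are only treated as separators when they are not inside a parenthesized group. This is an illustration in Python, not the parser from the commit; split_top_level is a hypothetical helper.

```python
from typing import List


def split_top_level(spec: str, sep: str = ";") -> List[str]:
    """Split `spec` on `sep`, ignoring separators inside parentheses."""
    parts = []
    depth = 0
    start = 0
    for i, ch in enumerate(spec):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced ')' at position %d" % i)
        elif ch == sep and depth == 0:
            parts.append(spec[start:i])
            start = i + 1
    if depth != 0:
        raise ValueError("unbalanced '('")
    parts.append(spec[start:])
    return parts


uri = "combined://slink/rt-server;combined/(arclink/ar-server;fdsnws/url??1stMax=2d)"
scheme, _, body = uri.partition("://")
sources = split_top_level(body)

print(scheme)
for source in sources:
    print(source)
```

The outer split yields two sources: the seedlink one and the nested combined group, whose inner semicolon is left untouched for a second parsing pass.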

Note that rtMax (or slinkMax) got a new alias, 1stMax. Time spans can also be configured with an optional unit suffix (default = s):

Suffix  Multiplier
s       1
m       60
h       3600
d       86400
w       86400*7
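As a worked example of the suffix table, the conversion to seconds could look like the sketch below. parse_timespan is a hypothetical helper, not the function used in SeisComP, and it assumes whole-number values; whether fractional values are accepted is not stated here.

```python
# Multipliers taken from the table above.
SUFFIX_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 86400 * 7}


def parse_timespan(text: str) -> int:
    """Convert a time span such as '2d' or '7200' to seconds."""
    text = text.strip()
    if not text:
        raise ValueError("empty time span")
    factor = 1  # no suffix means seconds
    if text[-1] in SUFFIX_SECONDS:
        factor = SUFFIX_SECONDS[text[-1]]
        text = text[:-1]
    return int(text) * factor  # int() raises ValueError on bad input


print(parse_timespan("7200"))
print(parse_timespan("2h"))
print(parse_timespan("2d"))
```

Under these assumptions, 1stMax=2d and 1stMax=172800 describe the same limit.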

Let me know if that works for you.

yannikbehr commented 9 years ago

Works perfectly. Thanks!