DASDAE / dascore

A python library for distributed fiber optic sensing
Other
71 stars 16 forks source link

spool.concatenate with new dimension has incorrect shape #404

Closed d-chambers closed 1 month ago

d-chambers commented 2 months ago

Description

Specifying a new dimension in BaseSpool.concatenate returns a spool whose patch has only a length of 1 in the new dimension, regardless of the input shape.

Example

import dascore as dc

sp = dc.get_example_spool()
new = sp.concatenate(new_dim=1)
patch = new[0] 

assert patch.shape[-1] == len(sp)  # fails

Expected behavior

The new dimension should be the same as the length of the spool.

Versions

d-chambers commented 2 months ago

Turns out this isn't a bug. In the example above the time dimensions don't match so the patches cannot be concatenated along a new dimension. I will create a PR that clarifies this in the docstrings.

ahmadtourei commented 2 months ago

Yes that is correct. I have patches with dft history so they still have the time coordinate but two dimensions (distance and ft_time) which are identical. What would be the best strategy to concatenate them? I can update their coords and drop the time coord so I can concatenate but that way I can't do idft on them.

ahmadtourei commented 2 months ago

can we add an ignore_coord argument support so we can concatenate these patches?

d-chambers commented 2 months ago

So the flag would indicate if non-dimensional coords should be dropped right? Or do you have to name the coords to ignore?

ahmadtourei commented 2 months ago

Either way would work. We can have an ignore_coord="time" to ignore (and not to drop) the time coord when concatenating the patches. Or, having something like conflict="ignore_diff_coords" to ignore all unidentical coords (such as the time coord in my dft patches).

d-chambers commented 2 months ago

Ok, let me think about this a bit more and circle back next week. Ignoring, but not dropping, the coords gets tricky because the incompatible coords need to be included in a single patch. We could make a 2d coord in that case, or we could create separate coords for time_min, time_step, and time_end that would each be 1d across the new axis.

ahmadtourei commented 1 month ago

I thought a bit more about it and agree with you about how ignoring the coordinates can be challenging and dropping would make more sense. I'd suggest we include an example of dropping the coord before concatenating to clarify this in the spool's tutorial. I have also implemented a new test for this that I'll push in a bit.