DASDAE / dascore

A python library for distributed fiber optic sensing

DASCore fails to read training data for phasenet DAS #409

Closed by d-chambers 1 month ago

d-chambers commented 1 month ago

Description

@Shihao-Yuan reported that DASCore has issues reading some of the files associated with the phasenet DAS training data found here. We need to look into it more.

Versions

d-chambers commented 1 month ago

After some quick digging, I noticed that the Eureka data are in prodml and the Ridgecrest data are a custom format described here.

We could technically support the custom format. The main hang-up is that the distance values for the cable are not provided, and other metadata we might want (such as the gauge length) are also missing. We could just assume the fiber distance starts at 0, I guess 🤷🏻.
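
For illustration, here is a minimal sketch of the "assume the distance starts at 0" fallback. The function name and the channel spacing value are hypothetical (the spacing would have to come from the dataset documentation or the user), and this is not an existing DASCore API:

```python
def assumed_distance_axis(n_channels, channel_spacing_m, start_m=0.0):
    """Build distance coordinates for a cable whose positions are not stored.

    Assumes evenly spaced channels, with the first channel at start_m.
    """
    if n_channels < 0:
        raise ValueError("n_channels must be non-negative")
    if channel_spacing_m <= 0:
        raise ValueError("channel_spacing_m must be positive")
    return [start_m + i * channel_spacing_m for i in range(n_channels)]


# Placeholder values: 4 channels, 8 m apart (not taken from the real data).
distance = assumed_distance_axis(4, 8.0)
```

Any distances built this way are relative to the first recorded channel, so they would not necessarily match true positions along the cable.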

I also found an issue with getting the metadata from prodml files, likely introduced in #394. As a result, metadata such as the gauge length are currently not found.

d-chambers commented 1 month ago

For the Ridgecrest data, @Shihao-Yuan mentioned the DAS data didn't look "right", so there was some concern the data were being scrambled in some way. Plotting one of the files, we get:

image

but this is similar to what this notebook gets, so I think we are just seeing common-mode noise.

Applying a high-pass filter improves the image:

image

so, after fixing the attribute issue above, I think we are OK for the prodml files.
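
To make the high-pass step concrete, here is a rough sketch of the idea on a single channel. It is not the filter used for the plots above: as a stand-in, it subtracts a centered moving average along time, which removes slow drift and keeps faster changes. The function name and window length are hypothetical.

```python
def highpass_moving_average(trace, window):
    """Crude high-pass: subtract a centered moving average from one channel.

    trace is a sequence of samples along time; window is the averaging
    length in samples. Near the edges the average uses fewer samples.
    """
    if window < 1:
        raise ValueError("window must be at least 1 sample")
    n = len(trace)
    half = window // 2
    out = []
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, i + half + 1)
        local_mean = sum(trace[lo:hi]) / (hi - lo)
        out.append(trace[i] - local_mean)
    return out


# A constant offset of 10 with one short spike; the offset is removed.
filtered = highpass_moving_average([10.0, 10.0, 13.0, 10.0, 10.0], window=3)
```

For real data, a proper frequency-domain or Butterworth high-pass along the time axis would be the better choice; this only shows why removing the slow component cleans up the image.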

d-chambers commented 1 month ago

Closed for now by #410, but we may implement read support for the custom DAS format later on.