yalaudah / facies_classification_benchmark

This repository includes the PyTorch code and the data needed to reproduce the results of our paper "A Machine Learning Benchmark for Facies Classification" (published in the SEG Interpretation journal, August 2019).
MIT License

Data shapes not matching #5

Closed suellenmotta closed 5 years ago

suellenmotta commented 5 years ago

Hello!

Do the split sizes described in the paper match the shapes of the data files? In the train/test split section you state:

  1. Training set: Inline range: [300-701] (402 inlines) and crosslines [300-1000] (701 crosslines)
  2. Testing set 1: Inline range: [100-299] (200 inlines) and crossline range: [300-1000] (701 crosslines)
  3. Testing set 2: Inline range: [100-701] (602 inlines) and crossline range: [1001-1200] (200 crosslines)

But I just opened the corresponding label files, and their shapes are:

  1. (401,701,255)
  2. (200,701,255)
  3. (601,200,255)

Are the inline counts wrong for volumes 1 and 3, or am I missing something?
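
For reference, this is roughly how I checked the shapes (a minimal sketch; the `.npy` file names are just my assumption about how the data is laid out in the repository):

```python
import numpy as np

# Label volumes as distributed with the repository (file names assumed).
train_labels = np.load("data/train/train_labels.npy")
test1_labels = np.load("data/test_once/test1_labels.npy")
test2_labels = np.load("data/test_once/test2_labels.npy")

print(train_labels.shape)  # (401, 701, 255)
print(test1_labels.shape)  # (200, 701, 255)
print(test2_labels.shape)  # (601, 200, 255)

# The inclusive inline ranges from the paper give one more slice than the data:
print(701 - 300 + 1)  # 402 inlines for [300-701], but the training volume has 401
print(701 - 100 + 1)  # 602 inlines for [100-701], but the test-set-2 volume has 601
```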

Thank you!

yalaudah commented 5 years ago

Hi @suellenmotta, thanks for bringing this up!

I double-checked the code that was used to generate the train/test split, and it seems I copied the number of inlines into the paper incorrectly. The data is correct; the paper should have listed 401 inlines for the training set and 601 inlines for testing set 2, matching the label-file shapes you reported.

I've corrected this in the paper, and will upload the fixed version to arXiv soon.

Again, thanks for spotting this mistake! If you have any further questions please let me know, otherwise, I will close this issue.

suellenmotta commented 5 years ago

Thank you, @yalaudah, for the data and for the answer!

Just one more thing: would it be possible to make the whole label and seismic volumes available? I mean the train and test parts together, and ideally also the regions not used in your tests.
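
To illustrate what I mean, here is a rough sketch of how the provided pieces could be stitched into one label volume, assuming the `.npy` file names used by the repository and the inline/crossline ranges quoted above (testing set 1 covers inlines 100-299 just before the training block, and testing set 2 covers crosslines 1001-1200 just after it):

```python
import numpy as np

# File names assumed from the repository's data layout.
train = np.load("data/train/train_labels.npy")      # (401, 701, 255): training inlines, xlines 300-1000
test1 = np.load("data/test_once/test1_labels.npy")  # (200, 701, 255): inlines 100-299, xlines 300-1000
test2 = np.load("data/test_once/test2_labels.npy")  # (601, 200, 255): all inlines, xlines 1001-1200

# Stack testing set 1 before the training block along the inline axis...
left = np.concatenate([test1, train], axis=0)   # (601, 701, 255)
# ...then append testing set 2 along the crossline axis.
full = np.concatenate([left, test2], axis=1)    # (601, 901, 255)
print(full.shape)
```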

If not, could you confirm how the conversion was done? You converted the F3 time volume to depth using OpendTect, right? Which F3 velocity model did you use: Velocity_modelINT_.cbvs or Velocity_modelRMS_.cbvs?

Thank you very much!

yalaudah commented 5 years ago

Yes, the raw data (horizons, faults, and the combined seismic data and labels) are available here: https://www.dropbox.com/s/jken23jed6cbjhc/raw.zip

This was mentioned in the README file.

As for the time-to-depth conversion, I think Velocity_model_INT.cbvs was used (the conversion was done by one of my co-authors, so I would have to double-check with them). The conversion introduced some artefacts along the boundaries, so we decided to provide only the regions without any artefacts.

suellenmotta commented 5 years ago

Thank you very much, @yalaudah, I just downloaded the raw data. It will be very helpful!

yalaudah commented 5 years ago

I'm glad you find it helpful, @suellenmotta. Please remember to cite our paper (https://arxiv.org/abs/1901.07659) if you use the code or the data. Thanks!