gempy-project / gempy

GemPy is an open-source, Python-based 3-D structural geological modeling software, which allows the implicit (i.e. automatic) creation of complex geological models from interface and orientation data. It also offers support for stochastic modeling to address parameter and model uncertainties.
https://gempy.org
European Union Public License 1.2
980 stars 233 forks source link

v2 topography issues #179

Closed cfandel closed 5 years ago

cfandel commented 5 years ago

When adding topography to the model, the 2D geologic map plots appear to be flipped along the y axis. I'm using the same DEM to cut with as I used to define the model grid, so the coordinate systems should be identical.

In 3D, the map appears correct, but the interface surfaces are not visible below it (I assume this is the scikit-image issue?). I don't absolutely need to see the surfaces in 3D, but I do need to be able to cut the model with topography, view the 2D results, and export the cropped lith_block.

I tried to get around this by plotting a 2D geologic map manually using the same approach I was using in GemPy v1.0, but I'm running into a problem with the compute_model_at() function. When I feed it a list of xyz coordinates at the model surface, it returns a lith_block object that is the wrong dimensions (there are more items in lith_block than in the list of coordinates it is supposed to calculate at).

The notebook where I am testing this can be found here: https://github.com/cfandel/gottesacker/blob/master/gempy_v2_isfault.ipynb

Also, a separate issue: the model solution I get with GemPy v2 seems significantly different than with GemPy v1 (you can see the old map if you scroll down in the notebook above). Did you make major changes in how the interface surfaces are generated? I though maybe it could just be looking different because of how the topography is cut, but I can't check because I'm having all these visualization problems. :(

elimh commented 5 years ago

Hey, good to hear that you experienced the same problems as I did when I converted my model notebooks to the new structure! :D

For the flipped geological maps I am responsible, but I've already fixed that and made a pull request.

If you want to work around this again until @Leguark has merged my pull request, take a look at the tutorial notebook ch1-3-Grids. There you can see how the coordinates of the topography are stored in geo_model.grids and how you can access them. The resulting geological map after the model calculation is stored in geo_model.geological_map, which is similar to the resulting geological map of the old function gp.compute_model_at(surface_coordinates). This should make it possible to fully integrate it into your workflow!

About your separate problem: I also realized that the resulting maps and models calculated with v2.0 look slightly different from the old version, but not as different as in your notebook. I think/hope after you've reversed the topography, your results should be more or less similar.

Leguark commented 5 years ago

Regarding the differences between gempy 1 and 2 in the geological map, they have to do with the max and min value of the scalar field on the answer. Usually if you interpolate the geological map and the regular grid together (like we do in GemPy v2 by default) you are going to get more consistent results in 3D.

In short, the new map should be better but the best way to be sure is to plot the surfaces and topography in vtk.

cfandel commented 5 years ago

Thanks! I wanted to test this with the tutorial data since it's simpler than mine, but when I use the DEM file to set the topography, I again get a geologic map that is the wrong dimensions - it should be 50 * 50 = 2500 entries long, but instead it's much smaller, only 49 entries. When I use a randomly-generated topography I don't get this problem.

This is similar to the problem I was having with my own data, where compute_model_at() returns an object that doesn't match the dimensions of the model grid.

Also, with my own data, I can flip the geologic map so that it displays with north up, but it still looks significantly different than what I was getting with GemPy v1. I'm not sure if that's because the model itself is different, or if it's because the model is flipped north-south before it gets cut by the topography. If it's flipped before it gets cut, I'm not sure from just the tutorial info how to fix that - I would normally have tried compute_model_at(), but since it's behaving unexpectedly I can't .

Any ideas?

Note: It's hard to check since I can't plot the surfaces and the topo at the same time in VTK - only one at a time.

Leguark commented 5 years ago

That is weird, this figure you get it by running the onlap/erode tutorial. If you activate the regular grid and topography at the same time, you should be able to get something similar

image

cfandel commented 5 years ago

Ah OK - so I was able to visualize the layers and the topo at the same time in VTK with:

#Set topography:
geo_model.grid.set_topography(source='gdal', filepath=demfile)               #set the model topography from a file
geo_model.grid.set_active('topography')
geo_model.grid.set_active('regular')

#Run model:
interp_data = gp.set_interpolation_data(geo_model, output='geology', theano_optimizer='fast_compile') #when using topography, must set output='geology'? (why?)
sol = gp.compute_model(geo_model, compute_mesh=True)                                   #to see both topo and layers, must set compute_mesh=True

compute_mesh has to be set to True, and both grids have to be activated. This solves the visualization problem, but not the problem that the geologic map and compute_model_at() both give an incorrectly-sized array when using topography from a DEM (the problem doesn't occur with randomly-generated topo).

Also, when I do this for my data rather than the tutorial data, if I turn compute_mesh on, I get another dimension error, printed below (but if compute_mesh is off, no error, but also no VTK visualization). The x and y axes seem to be flipped (switching the x and y resolution in init_data does not fix the problem). Since I'm using the DEM to set my model grid in the first place, there shouldn't be any mismatch. When I use a randomly-generated topography, I also get this error. If I don't use topography at all, then I don't get the error.

~\Anaconda3\envs\gp2\lib\site-packages\gempy\core\solution.py in mask_topo(self, mask_matrix)
    173 
    174     def mask_topo(self, mask_matrix):
--> 175         a = (~self.grid.regular_grid.mask_topo) * mask_matrix
    176         return a
    177 

ValueError: operands could not be broadcast together with shapes (200,251,26) (251,200,26) 
Leguark commented 5 years ago

it seems you have a nice case there for us to debug that part of GemPy (which definitely needs some work). Can we access the notebook and the data from somewhere?

cfandel commented 5 years ago

That would be SUPER helpful if you want to look into it some! If I make the model square (same x and y resolution), I don't get this error (but then I've put in inaccurate dimensions and it's not actually useful at that point). I'm having some unrelated GitHub issues right now, so I manually uploaded a copy of both notebooks here:

gottesacker data testing: https://github.com/cfandel/gottesacker/blob/master/gempy_v2_isfault.ipynb tutorial data testing: https://github.com/cfandel/gottesacker/blob/master/gempy_tutorials.ipynb

Note that despite the title I turned the faults off until I can fix the other problems.

Also, here is what the map output from GemPy v1 looked like (seems closer to correct): image

Leguark commented 5 years ago

You have a serious amount of code already there. I am eager to take a better look!

I just cloned your repo and try to run the gempy_v2_isfault notebook but I got an error because in your manager you still have the order_series argument of gempy v1. Am I in the wrong branch?

Screenshot 2019-05-22 at 09 05 34
cfandel commented 5 years ago

Derp sorry about that! Didn't give you the right version of the GK_all_pts.csv file. All the code at the beginning you can mostly ignore - it's set up that way because in my main script I run GemPy a whole bunch of times with different settings, so I need to keep track of what data and parameters I used for each run, and it was faster to have Python create and save the input data files than to do it manually each time (that's what the manager is for). The order column is so I can reshuffle things when selecting data points. I just uploaded the newer version of the GK_all_pts.csv file (here: https://github.com/cfandel/gottesacker/blob/master/data/GK_all_points.csv). That should work. Still working in fixing the GitHub issue so I have to do manual uploads for now sorry! :(

elimh commented 5 years ago

Thanks! I wanted to test this with the tutorial data since it's simpler than mine, but when I use the DEM file to set the topography, I again get a geologic map that is the wrong dimensions - it should be 50 * 50 = 2500 entries long, but instead it's much smaller, only 49 entries. When I use a randomly-generated topography I don't get this problem.

One important thing to mention here is that both geo_model.grid.regular_grid and geo_model.grid.topography have their own resolutions - in the case of the tutorial notebook the file bogota.tif has a very low resolution (7*7), so the resulting map has only 49 entries. The reason why this does not happen with randomly generated topographies is that the default resolution of the topography is the resolution of the regular_grid (but it can also be increased or decreased).

The resolution of the topography is used for the representation of the geological map, while the resolution of the regular grid is used for sections or block models. This means that you could reduce the resolution of your Gempy model (also to save computing time) and still get a high-resolution map.

cfandel commented 5 years ago

That's really useful to know! The resolution of the topography is separate from the problem I'm having with compute_model_at() though - I would expect an output of the same size as the number of points I input, but instead the output is larger. Do you know why that happens?

elimh commented 5 years ago

Instead of using compute_model_at(), I would set a custom grid with the surface coordinates. This works similar but is more reliable. I tried it in your notebook (lower part), here you can see how it works:

https://github.com/cgre-aachen/gempy/blob/topography_issues/notebooks/chloe/gempy_v2_isfault.ipynb

cfandel commented 5 years ago

OK, so you are computing the model at the surface points by setting a custom grid? That makes sense. Can you explain what l0 and l1 correspond to? Are they the starting and ending indices in sol that correspond to the lith_block for that grid? Could you slice solusing the indices for any of the grids to get the output for that grid? And when you reshape the result, why are you reshaping it with the resolution of the 'topography' grid and not the resolution of the 'custom' grid? If you are working with multiple grids, how do you keep track of which grid resolution to reshape with? This will be really important for me later too when I need to export the model outputs to gdal.

Thanks so much for all the help!

Also, a separate problem: When you compute the model initially (block 31), are you able to do that with the topography set? For me, gp.compute_model(geo_model, compute_mesh=True) gives me a dimension error (but it only happens if I try to set the topography from a raster file):

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-10-3dfd6c4329d9> in <module>
      3 #gp.set_interpolation_data(geo_model, theano_optimizer='fast_compile') #compile theano and interpolate the input data
      4 
----> 5 sol = gp.compute_model(geo_model, compute_mesh=True)               #compute the model from the data - returns an output (sol) which includes lith and faults, each with two arrays, one for the id of the formation in each cell, one with orientations in each cell
      6 #need to set compute_mesh=True in order to see the interfaces in 3D
      7 #if compute_mesh=False, model computes geologic map but not interfaces in 3D

~\Anaconda3\envs\gp2\lib\site-packages\gempy\core\gempy_api.py in compute_model(model, output, compute_mesh, reset_weights, reset_scalar, reset_block, sort_surfaces, debug, set_solutions)
    424     elif set_solutions is True:
    425         if model.grid.active_grids[0] is np.True_:
--> 426             model.solutions.set_solution_to_regular_grid(sol, compute_mesh=compute_mesh)
    427         # TODO @elisa elaborate this
    428         if model.grid.active_grids[2] is np.True_:

~\Anaconda3\envs\gp2\lib\site-packages\gempy\core\solution.py in set_solution_to_regular_grid(self, values, compute_mesh)
     89         if compute_mesh is True:
     90 
---> 91             self.compute_all_surfaces()
     92             # except RuntimeError:
     93             #     warnings.warn('It is not possible to compute the mesh.')

~\Anaconda3\envs\gp2\lib\site-packages\gempy\core\solution.py in compute_all_surfaces(self, **kwargs)
    221         self.vertices = []
    222         self.edges = []
--> 223         self.padding_mask_matrix()
    224         series_type = self.series.df['BottomRelation']
    225         s_n = 0

~\Anaconda3\envs\gp2\lib\site-packages\gempy\core\solution.py in padding_mask_matrix(self, mask_topography)
    187 
    188             if mask_topography and self.grid.regular_grid.mask_topo.size != 0:
--> 189                 mask_pad = self.mask_topo(mask_pad)
    190 
    191             self.mask_matrix_pad.append(mask_pad.T)

~\Anaconda3\envs\gp2\lib\site-packages\gempy\core\solution.py in mask_topo(self, mask_matrix)
    173 
    174     def mask_topo(self, mask_matrix):
--> 175         a = (~self.grid.regular_grid.mask_topo) * mask_matrix
    176         return a
    177 

ValueError: operands could not be broadcast together with shapes (12,10,9) (10,12,9) 
cfandel commented 5 years ago

Also, one other question: I took a small subset of my data and ran gempy with it to look at the resulting geologic map compared to the data points. All the points plotted here are strike & dip measurements taken at the land surface, usually somewhere inside a geologic unit (not on the interface) - so the map unit at each point should be the same as unit for the point. But instead I get major discrepancies (especially notice the center right of the map - a whole bunch of Schrattenkalk measurements are there, but the model thinks it's the basement). Does this mean that there is something going wrong with the map visualization, or with the model itself? It just seems way, way off and I can't figure out why.

image

Note: There are two interface points for each unit also, the bare minimum, not shown here. And I computed the model with a pretty high vertical resolution to check if that was the source of the problem, but it doesn't seem to be.

Update: It seems to be something with the model itself - when I plot just the unit interface surfaces in 3D, they don't match up with the data at all: image

elimh commented 5 years ago

OK, so you are computing the model at the surface points by setting a custom grid? That makes sense. Can you explain what l0 and l1 correspond to? Are they the starting and ending indices in sol that correspond to the lith_block for that grid? Could you slice solusing the indices for any of the grids to get the output for that grid?

Yes, exactly! I've updated the notebook linked above, maybe it's clearer when you look at it.

And when you reshape the result, why are you reshaping it with the resolution of the 'topography' grid and not the resolution of the 'custom' grid? If you are working with multiple grids, how do you keep track of which grid resolution to reshape with? This will be really important for me later too when I need to export the model outputs to gdal.

The custom grid has no "resolution" property, it is basically just x,y and z coordinates that you can set if you want your model to be calculated at certain locations. Because the custom grid, in this case, are the topography coordinates, I used the resolution of the topography.

Thanks so much for all the help!

Also, a separate problem: When you compute the model initially (block 31), are you able to do that with the topography set? For me, gp.compute_model(geo_model, compute_mesh=True) gives me a dimension error (but it only happens if I try to set the topography from a raster file):

I think I more or less fixed that already and pushed it to the master branch. If you cloned gempy via git, you should not get this error after the update :)

atwahsz commented 5 years ago

interesting work! . i hope your issue will get fixed soon!

cfandel commented 5 years ago

OK this makes a lot of sense - thank you! The only issue I am still having now is the dimension error when compute_mesh is set to True. I currently have gempy v2.0b0.dev2, installed via pip, to deal with the sci-kit image version requirements. Is it possible for me to get your update using pip?

Leguark commented 5 years ago

the fix I did on the gempy side not on sci-kit. I am bit confused you still have error. Can you paste it here?

cfandel commented 5 years ago

It's the same one as above - whenever I set a resolution where xres does not equal yres, I get a dimension error. The following code:

xmin = 578287.5
xmax = 590837.5
ymin = 5240062.5
ymax = 5250062.5
xres = 200
yres = 150
zres = 52
dx   = (xmax-xmin)/xres
dy   = (ymax-ymin)/yres
dz   = (zmax-zmin)/zres

gp.init_data(geo_model, [xmin,xmax, ymin,ymax, zmin,zmax], [xres,yres,zres],  path_o = orfile,  path_i = intfile)
gp.map_series_to_surfaces(geo_model, {'strat': unitNames}) 
geo_model.set_topography(source='gdal', filepath=dem_path) 
geo_model.grid.set_active('topography') 
geo_model.grid.set_active('regular')
gp.set_interpolation_data(geo_model, output='geology', theano_optimizer='fast_compile')
sol = gp.compute_model(geo_model, compute_mesh=True)

gives the following error (only when compute_mesh=True):

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-16-3dfd6c4329d9> in <module>
      3 #gp.set_interpolation_data(geo_model, theano_optimizer='fast_compile') #compile theano and interpolate the input data
      4 
----> 5 sol = gp.compute_model(geo_model, compute_mesh=True)               #compute the model from the data - returns an output (sol) which includes lith and faults, each with two arrays, one for the id of the formation in each cell, one with orientations in each cell
      6 #need to set compute_mesh=True in order to see the interfaces in 3D
      7 #if compute_mesh=False, model computes geologic map but not interfaces in 3D

~\Anaconda3\envs\gp2\lib\site-packages\gempy\core\gempy_api.py in compute_model(model, output, compute_mesh, reset_weights, reset_scalar, reset_block, sort_surfaces, debug, set_solutions)
    424     elif set_solutions is True:
    425         if model.grid.active_grids[0] is np.True_:
--> 426             model.solutions.set_solution_to_regular_grid(sol, compute_mesh=compute_mesh)
    427         # TODO @elisa elaborate this
    428         if model.grid.active_grids[2] is np.True_:

~\Anaconda3\envs\gp2\lib\site-packages\gempy\core\solution.py in set_solution_to_regular_grid(self, values, compute_mesh)
     89         if compute_mesh is True:
     90 
---> 91             self.compute_all_surfaces()
     92             # except RuntimeError:
     93             #     warnings.warn('It is not possible to compute the mesh.')

~\Anaconda3\envs\gp2\lib\site-packages\gempy\core\solution.py in compute_all_surfaces(self, **kwargs)
    221         self.vertices = []
    222         self.edges = []
--> 223         self.padding_mask_matrix()
    224         series_type = self.series.df['BottomRelation']
    225         s_n = 0

~\Anaconda3\envs\gp2\lib\site-packages\gempy\core\solution.py in padding_mask_matrix(self, mask_topography)
    187 
    188             if mask_topography and self.grid.regular_grid.mask_topo.size != 0:
--> 189                 mask_pad = self.mask_topo(mask_pad)
    190 
    191             self.mask_matrix_pad.append(mask_pad.T)

~\Anaconda3\envs\gp2\lib\site-packages\gempy\core\solution.py in mask_topo(self, mask_matrix)
    173 
    174     def mask_topo(self, mask_matrix):
--> 175         a = (~self.grid.regular_grid.mask_topo) * mask_matrix
    176         return a
    177 

ValueError: operands could not be broadcast together with shapes (150,200,52) (200,150,52) 
Leguark commented 5 years ago

Is this solved @elimh ? If yes, close the issue please!

elimh commented 5 years ago

I don't know, is it working for you now, @cfandel?

cfandel commented 5 years ago

Yes sorry was out for fieldwork & away from internet-land. :) Still need to do some tests with the new updates you made, but I think it should work fine now. Will let you know if I run into any more problems!