brendanjmeade / celeri

Next generation earthquake cycle kinematics
BSD 3-Clause "New" or "Revised" License

Western US/Cascadia model with post-snap_segments commit #102

Open brendanjmeade opened 2 years ago

brendanjmeade commented 2 years ago

@jploveless I've pulled the latest commit and am hitting a bug when running a Western US/Cascadia model with it. The command I'm using is:

python celeri_solve.py ../data/command/western_north_america_command.json --repl 1 --plot_estimation_summary 1 --plot_input_summary 1 --solve_type dense --pickle_save 0 --reuse_elastic 1

and the error I'm receiving is:

Traceback (most recent call last):

> File "/Users/meade/Desktop/celeri/notebooks/celeri_solve.py", line 47, in <module>
    main(args)
    │    └ {'command_file_name': '../data/command/western_north_america_command.json', 'segment_file_name': None, 'station_file_name': N...
    └ <function main at 0x1041e7a30>

  File "/Users/meade/Desktop/celeri/notebooks/celeri_solve.py", line 26, in main
    estimation, operators, index = celeri.build_and_solve_dense(
                │                  │      └ <function build_and_solve_dense at 0x1431997e0>
                │                  └ <module 'celeri' from '/Users/meade/Desktop/celeri/celeri/__init__.py'>
                └ {'meshes': [{}], 'slip_rate_to_okada_to_velocities': array([[-1.11429151e-06, -4.65651451e-10, -1.21282842e-06, ...,
                          ...

  File "/Users/meade/Desktop/celeri/celeri/celeri.py", line 4542, in build_and_solve_dense
    index, estimation = assemble_and_solve_dense(
                        └ <function assemble_and_solve_dense at 0x143199000>

  File "/Users/meade/Desktop/celeri/celeri/celeri.py", line 3232, in assemble_and_solve_dense
    estimation.operator = get_full_dense_operator(operators, meshes, index)
    │                     │                       │          │       └ {'n_stations': 1686, 'vertical_velocities': array([   2,    5,    8, ..., 5051, 5054, 5057]), 'n_blocks': 31, 'n_block_constr...
    │                     │                       │          └ [{'meshio_object': <meshio mesh object>
    │                     │                       │              Number of points: 1916
    │                     │                       │              Number of cells:
    │                     │                       │                vertex: 939
    │                     │                       │                line: 529
    │                     │                       │                triangl...
    │                     │                       └ {'meshes': [{}], 'slip_rate_to_okada_to_velocities': array([[-1.11429151e-06, -4.65651451e-10, -1.21282842e-06, ...,
    │                     │                                 ...
    │                     └ <function get_full_dense_operator at 0x143198f70>
    └ {'data_vector': array([-38.713,  50.072, -66.958, ...,   0.   ,   0.   ,   0.   ]), 'weighting_vector': array([4.70541735, 4....

  File "/Users/meade/Desktop/celeri/celeri/celeri.py", line 3178, in get_full_dense_operator
    operators.rotation_to_velocities[index.station_row_keep_index, :]
    │                                └ {'n_stations': 1686, 'vertical_velocities': array([   2,    5,    8, ..., 5051, 5054, 5057]), 'n_blocks': 31, 'n_block_constr...
    └ {'meshes': [{}], 'slip_rate_to_okada_to_velocities': array([[-1.11429151e-06, -4.65651451e-10, -1.21282842e-06, ...,
              ...

ValueError: operands could not be broadcast together with shapes (3372,90) (3372,93)

To me this looks like it's somehow getting the wrong number of blocks (i.e., 30 vs. 31)? Can you run this and confirm at your end? I'm not sure of the exact source, but the error seems to have appeared with the snap_segments push. I haven't changed the input files, but perhaps I have to for snap_segments?

jploveless commented 2 years ago

@brendanjmeade I'm confirming this on my end, and also confirming that the model runs without this error when the snap_segments call is commented out in process_segment. Initially I thought it was a hanging segment issue brought about by snap_segments but that's not the case. I'm looking into it!

brendanjmeade commented 2 years ago

@jploveless This sounds like an interesting one. Related to this, I feel like there should be a flag for calling snap_segments. What do you think about adding a binary snap_segments = 0/1 flag in the command file so that we can wrap the call to snap_segments inside an if statement? I can imagine we could do this mesh by mesh as well. We could put the logic just before the call to snap_segments:

    segment = snap_segments(segment, meshes)
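
A minimal sketch of that gating, assuming a hypothetical snap_segments key in the parsed command file (the key name and its default are assumptions, not existing celeri fields):

    # Hypothetical sketch: gate the call on a command-file flag.
    # The "snap_segments" key and its default of 0 are assumptions.
    if int(command.get("snap_segments", 0)) == 1:
        segment = snap_segments(segment, meshes)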
jploveless commented 2 years ago

@brendanjmeade I updated get_rotation_to_velocities_partials to take n_blocks as a second input argument, and n_blocks is passed as data.block.shape[0] (except in the call to calculate floating stations, where it's just 1). What seems to be happening is that, given the segment re-ordering that takes place in snap_segments (mesh edges are appended to the bottom of the edited segment dataframe), the block labeling routine applies labels in a different order. For the Western NA case, this resulted in Juan de Fuca being block label 30 (the 31st or last block), but because there are no stations on JdF, the previous definition of n_blocks = np.max(station.block_label.values) + 1 was smaller (by 3 columns) than it needed to be. The resulting estimation summary plot looks pretty good, but if you could take a quick look through the code diff to check that this isn't breaking anything else, I'd appreciate it!
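
For reference, a toy numpy illustration of the undercount (the block labels below are made up): when the highest-numbered block has no stations, counting blocks from station labels comes up three columns short of counting rows in the block dataframe, which matches the (3372, 90) vs. (3372, 93) mismatch above.

    import numpy as np

    # Made-up example: 31 blocks exist (labels 0..30), but no station sits on block 30.
    station_block_label = np.array([0, 0, 4, 12, 29])
    n_blocks_from_stations = np.max(station_block_label) + 1  # 30 -> 3 * 30 = 90 columns
    n_blocks_from_block_file = 31                              # block.shape[0] -> 93 columns
    print(n_blocks_from_stations, n_blocks_from_block_file)   # 30 31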

jploveless commented 2 years ago

> @jploveless This sounds like an interesting one. Related to this, I feel like there should be a flag for calling snap_segments. What do you think about adding a binary snap_segments = 0/1 flag in the command file so that we can wrap the call to snap_segments inside an if statement? I can imagine we could do this mesh by mesh as well. We could put the logic just before the call to snap_segments:
>
>     segment = snap_segments(segment, meshes)

Agreed! It turns out that for the global model, most of the segments already perfectly conform to the mesh edges; it's only a handful (Nankai, Sagami, Cascadia, maybe 1 or 2 more) that don't.
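
One rough way to flag the non-conforming segments would be to check whether every segment endpoint coincides with a mesh edge node; the sketch below is purely illustrative (the array names and tolerance are assumptions, not how celeri does it):

    import numpy as np

    def nonconforming_endpoints(seg_lonlat, edge_lonlat, tol=1e-6):
        """Indices of segment endpoints farther than tol (degrees) from any mesh edge node.

        seg_lonlat: (n, 2) array of segment endpoint lon/lat
        edge_lonlat: (m, 2) array of mesh edge node lon/lat
        """
        # Pairwise endpoint-to-node distances (fine for modest n and m)
        d = np.linalg.norm(seg_lonlat[:, None, :] - edge_lonlat[None, :, :], axis=2)
        return np.where(d.min(axis=1) > tol)[0]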

brendanjmeade commented 2 years ago

> @brendanjmeade I updated get_rotation_to_velocities_partials to take n_blocks as a second input argument, and n_blocks is passed as data.block.shape[0] (except in the call to calculate floating stations, where it's just 1). What seems to be happening is that, given the segment re-ordering that takes place in snap_segments (mesh edges are appended to the bottom of the edited segment dataframe), the block labeling routine applies labels in a different order. For the Western NA case, this resulted in Juan de Fuca being block label 30 (the 31st or last block), but because there are no stations on JdF, the previous definition of n_blocks = np.max(station.block_label.values) + 1 was smaller (by 3 columns) than it needed to be. The resulting estimation summary plot looks pretty good, but if you could take a quick look through the code diff to check that this isn't breaking anything else, I'd appreciate it!

@jploveless I've read through the changes now and it does run for me! I have to update the command line parser to work with save_elastic so that I can test it with precomputed elastic partials. Do you think snap_segments will play nicely with precomputed partials? I'm wondering if it does segment reordering that might make this complicated.

brendanjmeade commented 2 years ago

> Agreed! It turns out that for the global model, most of the segments already perfectly conform to the mesh edges; it's only a handful (Nankai, Sagami, Cascadia, maybe 1 or 2 more) that don't.

Ok, I'll do this. First I'll add a global override in the command file, and then I'll add flags in the mesh_parameters file for each mesh. Is there a method currently used to determine whether snap_segments acts on the boundaries of a particular mesh?
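
To make the per-mesh idea concrete, a sketch of how that gating could look; the snap_segments flags in the command file and in each mesh_parameters entry are hypothetical, not existing celeri fields:

    # Hypothetical per-mesh gating with a global override (all flag names are assumptions).
    if int(command.get("snap_segments", 0)) == 1:
        snap_meshes = [m for m in meshes if int(m.get("snap_segments", 1)) == 1]
        if snap_meshes:
            segment = snap_segments(segment, snap_meshes)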

jploveless commented 2 years ago

@brendanjmeade I'm sorry — I did this in https://github.com/brendanjmeade/celeri/commit/fe6599b6ac0ac1b9b72f3e6788e8688963d903dd, but then the tests failed so I had to make a quick edit and therefore the commit message wasn't clear. But I'll now pull your version and hope that things merge fine.

With all of this, I think we should make segment snapping an optional routine outside of celeri.py. Thinking about it some more, it makes sense for it to really be part of the data preparation rather than part of the analysis. I think the default should be that we prepare segment files whose segments are snapped to conform to any mesh geometries, but this is something that can be done once and saved rather than done within a run. If that sounds reasonable to you, I'm going to strip all references to segment snapping out of celeri.py and make a new snap_segments.py file that can be called from the command line. Sorry for this complexity; I should have done this from the beginning. It turns out how we had it in Blocks was probably the right move: done at the time of saving a .segment file!
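
A minimal sketch of what that standalone step could look like; the file formats, the read_meshes helper, and the exact celeri imports are assumptions about how it might be wired up, not the final interface:

    # snap_segments.py -- hypothetical standalone preprocessing sketch
    import argparse

    import pandas as pd

    # Assumed imports; the real celeri module layout may differ.
    from celeri import read_meshes, snap_segments


    def main():
        parser = argparse.ArgumentParser(description="Snap segment endpoints to mesh edges.")
        parser.add_argument("segment_file_name", help="Input segment file (assumed .csv)")
        parser.add_argument("mesh_parameters_file_name", help="Mesh parameters .json file")
        parser.add_argument("output_file_name", help="Snapped segment file to write")
        args = parser.parse_args()

        segment = pd.read_csv(args.segment_file_name)
        meshes = read_meshes(args.mesh_parameters_file_name)  # assumed helper
        segment = snap_segments(segment, meshes)
        segment.to_csv(args.output_file_name, index=False)


    if __name__ == "__main__":
        main()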

jploveless commented 2 years ago

> @jploveless Do you think snap_segments will play nicely with precomputed partials? I'm wondering if it does segment reordering that might make this complicated.

I would like to go through and allow the pre-computed partials routine to behave more like it did in Blocks, in that only new segments would have their partials freshly calculated. This involves comparing the segment geometry associated with saved partials to the current run's segment geometry. In these cases, segments could be re-ordered, have their properties changed, or have additions/deletions/substitutions, but only truly new segments would need their partials to be recomputed. Some of the logic from snap_segments could be applied to this routine, essentially finding rows containing [lon1 lat1 lon2 lat2 locking_depth dip] that are unique to the current run's segment dataframe.
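
A sketch of that uniqueness check as a pandas merge on the geometry columns (column names follow the bracketed list above; exact float matching assumes coordinates are written identically between runs):

    import pandas as pd

    GEOM_COLS = ["lon1", "lat1", "lon2", "lat2", "locking_depth", "dip"]

    def new_segment_rows(current: pd.DataFrame, saved: pd.DataFrame) -> pd.DataFrame:
        """Rows of `current` whose geometry does not appear in `saved`.

        Only these rows would need freshly computed elastic partials; everything
        else could, in principle, reuse saved partials after reordering.
        """
        merged = current.merge(
            saved[GEOM_COLS].drop_duplicates(), on=GEOM_COLS, how="left", indicator=True
        )
        return current[merged["_merge"].to_numpy() == "left_only"]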

brendanjmeade commented 2 years ago

> @brendanjmeade I'm sorry — I did this in fe6599b, but then the tests failed so I had to make a quick edit and therefore the commit message wasn't clear. But I'll now pull your version and hope that things merge fine.

Ships in the night. I'm glad the argument parsing was reasonable to edit!

> With all of this, I think we should make segment snapping an optional routine outside of celeri.py. Thinking about it some more, it makes sense for it to really be part of the data preparation rather than part of the analysis. I think the default should be that we prepare segment files whose segments are snapped to conform to any mesh geometries, but this is something that can be done once and saved rather than done within a run. If that sounds reasonable to you, I'm going to strip all references to segment snapping out of celeri.py and make a new snap_segments.py file that can be called from the command line. Sorry for this complexity; I should have done this from the beginning. It turns out how we had it in Blocks was probably the right move: done at the time of saving a .segment file!

I think this is a good strategy. We can always change our minds in the future!

brendanjmeade commented 2 years ago

> I would like to go through and allow the pre-computed partials routine to behave more like it did in Blocks, in that only new segments would have their partials freshly calculated. This involves comparing the segment geometry associated with saved partials to the current run's segment geometry. In these cases, segments could be re-ordered, have their properties changed, or have additions/deletions/substitutions, but only truly new segments would need their partials to be recomputed. Some of the logic from snap_segments could be applied to this routine, essentially finding rows containing [lon1 lat1 lon2 lat2 locking_depth dip] that are unique to the current run's segment dataframe.

I think this is a fantastic idea, and I'm interested in it for reporting reasons too. For example, in celeri_report I'd like to print out something like the ~1-10 fault segments whose geometry changed from one run to the next.