yt-project / yt

Main yt repository
http://yt-project.org
Other
457 stars 274 forks source link

Creating a geometric object from a cut_region object does not respect the cut_region boundaries for particles. #1933

Open RicardaBeckmann opened 5 years ago

RicardaBeckmann commented 5 years ago

Bug report

Bug summary When using a cut_region object as a data_source for a (larger) geometric object, the number of particles in the final object is larger than the number of particles in the original cut_region.

For example, if I define a cut_region that only contains 50 stars, and entirely encompass it in a ds.box object, the new box will suddenly contain 138 stars. If I make a sphere that only contains 50 stars, and encompass it in a larger box, the larger box will still only contain 50 stars.

The example code below demonstrates this in an extreme case. Using my full simulation, the number of particles I get for the "cut region in box" situation isn't necessarily equal to the total number of particles, as in the example here. Instead, it is always larger but not by a consistent factor and not by an integer either. I've seen 27.2 times the original number of particles, and 98.33 times. The new number of particles is consistent for a given set of input parameters.

Code for reproduction

import yt

ds=yt.load('ramses_new_format/output_00002/info_00002.txt')
ad=ds.all_data()

print("Number of particles in the whole box",len(ad['io','particle_mass']))

#Geometric region in geometric region                                                                                                
print("-- Geometric region in geometric region")
sphere=ds.sphere([0.5,0.5,0.5],ds.quan(0.2,'code_length'))
print("Number of particles in a small sphere",len(sphere['io','particle_mass']))
sphere_in_box=ds.box([0.0,0.0,0.0],[1.0,1.0,1.0],data_source=sphere)
print("Number of particles in a box based on the sphere",len(sphere_in_box['io','particle_mass']))

#Cut region in geometric region                                                                                                      
print("-- Cut region in geometric region")
cut_region=ad.cut_region("obj['density'].in_units('amu/cm**3')<1")
print("Number of particles in the cut region",len(cut_region['io','particle_mass']))
cut_region_in_box=ds.box([0.0,0.0,0.0],[1.0,1.0,1.0],data_source=cut_region)
print("Number of particles in a box based on the cut_region",len(cut_region_in_box['io','particle_mass']))

Output

yt : [INFO     ] 2018-07-31 18:08:20,648 Parameters: current_time              = 0.0201172518911859
yt : [INFO     ] 2018-07-31 18:08:20,648 Parameters: domain_dimensions         = [32 32 32]
yt : [INFO     ] 2018-07-31 18:08:20,649 Parameters: domain_left_edge          = [ 0.  0.  0.]
yt : [INFO     ] 2018-07-31 18:08:20,649 Parameters: domain_right_edge         = [ 1.  1.  1.]
yt : [INFO     ] 2018-07-31 18:08:20,650 Parameters: cosmological_simulation   = 0
yt : [WARNING  ] 2018-07-31 18:08:20,653 Detected 1 extra gravity fields.
yt : [INFO     ] 2018-07-31 18:08:21,951 Adding particle_type: DM
yt : [INFO     ] 2018-07-31 18:08:21,969 Adding particle_type: star_tracer
yt : [INFO     ] 2018-07-31 18:08:21,989 Adding particle_type: dust_tracer
yt : [INFO     ] 2018-07-31 18:08:22,010 Adding particle_type: gas_tracer
yt : [INFO     ] 2018-07-31 18:08:22,033 Adding particle_type: cloud
yt : [INFO     ] 2018-07-31 18:08:22,055 Adding particle_type: cloud_tracer
yt : [INFO     ] 2018-07-31 18:08:22,075 Adding particle_type: dust
yt : [INFO     ] 2018-07-31 18:08:22,097 Adding particle_type: star
Number of particles in the whole box 600
-- Geometric region in geometric region
Number of particles in a small sphere 0
Number of particles in a box based on the sphere 0
-- Cut region in geometric region
Number of particles in the cut region 0
Number of particles in a box based on the cut_region 600

Expected outcome

I would expect the number of particles contained in an object to be determined by the smallest or most restrictive object used to build it. I.e. when a data_source is used to define a new object, I would expect the number of particles in the new object to be equal or smaller than the number of particles in the original object, but never bigger.

For the code-snippet, this means that the code should have reported 0 particles in the last printed line.

Version Information

ngoldbaum commented 5 years ago

I think this might be hard to implement in general, since the cut_region is a special case in a number of ways.

The first place to look would be to see how the data access works at a low level. In principle it should be using the CutRegionSelector to do the underlying geometric selection but it's possible that we've special cased things for the cut region in a way that avoids this code path, since the CutRegionSelector did not exist when we first implemented cut regions.

Also note that the CutRegionSelector has weird issues associated with it because it's sensitive to floating point round-off issues (see e.g. https://github.com/yt-project/yt/issues/1646).

I think this will be hard to fix in time for yt 3.5 so I've marked the milestone as 4.0.