tum-pbs / PhiFlow

A differentiable PDE solving framework for machine learning
MIT License

PhiFlow Fluid usage with PyTorch #14

Closed ne3x7 closed 4 years ago

ne3x7 commented 4 years ago

Hello, authors. First of all, thank you for your amazing work!

I can see that experimental PyTorch support has been added. I am trying to use it but am running into errors. Is it ready to be used yet? If so, could you please help me with the error below?

I am trying to use PhiFlow in a PyTorch-friendly style, and I get an error when Fluid validates the input density.

I am trying to do approximately the following:

from PhiFlow.phi.torch.flow import *
import torch
from torch.utils.data import TensorDataset, DataLoader
inputs = torch.rand(1000, 10, 10, 1)
targets = torch.rand(1000, 10, 10, 1)
dataset = TensorDataset(inputs, targets)
dataloader = DataLoader(dataset, batch_size=10)
domain = Domain([10, 10], boundaries=PERIODIC)
sz = domain.staggered_grid(0).staggered_tensor().shape
initial_velocity = torch.zeros(*sz)
initial_velocity = initial_velocity.expand(10, *initial_velocity.shape[1:])
velocity = torch.nn.Parameter(initial_velocity)
optimizer = torch.optim.Adam([velocity], lr=1e-3)
for epoch in range(10):
    for batch_index, (inputs, targets) in enumerate(dataloader):
        optimizer.zero_grad()
        domain = Domain([10, 10], boundaries=PERIODIC)
        fluid = Fluid(domain, batch_size=10, density=inputs, velocity=velocity)
        fluid.density = advect.semi_lagrangian(fluid.density, fluid.velocity, dt=1)
        fluid.density = diffuse(fluid.density, 1 * 0.1, substeps=1)
        loss = math.l2_loss(predictions, targets)
        loss.backward()
        optimizer.step()

And I get the following error:

Error stacktrace:

```python
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
 in ()
     16         optimizer.zero_grad()
     17         domain = Domain([10, 10], boundaries=PERIODIC)
---> 18         fluid = Fluid(domain, batch_size=10, density=inputs, velocity=velocity)
     19         fluid.density = advect.semi_lagrangian(fluid.density, fluid.velocity, dt=1)
     20         fluid.density = diffuse(fluid.density, 1 * 0.1, substeps=1)

~/.virtualenvs/torch1venv3/lib/python3.6/site-packages/phiflow-1.4.0-py3.6.egg/phi/physics/fluid.py in __init__(self, domain, density, velocity, buoyancy_factor, tags, name, **kwargs)
     26 
     27     def __init__(self, domain, density=0.0, velocity=0.0, buoyancy_factor=0.0, tags=('fluid', 'velocityfield', 'velocity'), name='fluid', **kwargs):
---> 28         DomainState.__init__(self, **struct.kwargs(locals()))
     29 
     30     def default_physics(self): return INCOMPRESSIBLE_FLOW

~/.virtualenvs/torch1venv3/lib/python3.6/site-packages/phiflow-1.4.0-py3.6.egg/phi/physics/physics.py in __init__(self, batch_size, **kwargs)
     20     def __init__(self, batch_size=None, **kwargs):
     21         self._batch_size = batch_size
---> 22         struct.Struct.__init__(self, **kwargs)
     23 
     24     @struct.constant(default=())

~/.virtualenvs/torch1venv3/lib/python3.6/site-packages/phiflow-1.4.0-py3.6.egg/phi/struct/struct.py in __init__(self, content_type, **kwargs)
     64             trait.endow(self)
     65         if content_type is VALID:
---> 66             self.validate()
     67 
     68     @derived()

~/.virtualenvs/torch1venv3/lib/python3.6/site-packages/phiflow-1.4.0-py3.6.egg/phi/struct/struct.py in validate(self)
    156         """
    157         if not skip_validate() and self.__content_type__ is INVALID:
--> 158             self.__validate__()
    159             self.__content_type__ = VALID
    160             return True

~/.virtualenvs/torch1venv3/lib/python3.6/site-packages/phiflow-1.4.0-py3.6.egg/phi/struct/struct.py in __validate__(self)
    166             trait.pre_validate_struct(self)
    167         for item in self.__items__:
--> 168             item.validate(self)
    169         for trait in self.__traits__:
    170             trait.post_validate_struct(self)

~/.virtualenvs/torch1venv3/lib/python3.6/site-packages/phiflow-1.4.0-py3.6.egg/phi/struct/structdef.py in validate(self, struct)
    148             for trait in self.traits:
    149                 value = trait.pre_validated(struct, self, value)
--> 150             value = self.validation_function(struct, value)
    151             for trait in self.traits:
    152                 value = trait.post_validated(struct, self, value)

~/.virtualenvs/torch1venv3/lib/python3.6/site-packages/phiflow-1.4.0-py3.6.egg/phi/physics/fluid.py in density(self, density)
     36         It describes the number of particles per physical volume.
     37         """
---> 38         return self.centered_grid('density', density)
     39 
     40     @struct.variable(default=0, dependencies=DomainState.domain)

~/.virtualenvs/torch1venv3/lib/python3.6/site-packages/phiflow-1.4.0-py3.6.egg/phi/physics/domain.py in centered_grid(self, name, value, components, dtype)
    241     def centered_grid(self, name, value, components=1, dtype=np.float32):
    242         extrapolation = Material.extrapolation_mode(self.domain.boundaries)
--> 243         return self.domain.centered_grid(value, dtype=dtype, name=name, components=components, batch_size=self._batch_size, extrapolation=extrapolation)
    244 
    245     def staggered_grid(self, name, value, dtype=np.float32):

~/.virtualenvs/torch1venv3/lib/python3.6/site-packages/phiflow-1.4.0-py3.6.egg/phi/physics/domain.py in centered_grid(self, data, components, dtype, name, batch_size, extrapolation)
    138             grid._age = 0.0
    139         else:
--> 140             grid = CenteredGrid.sample(data, self, batch_size=batch_size)
    141         assert grid.component_count == components, "Field has %d components but %d are required for '%s'" % (grid.component_count, components, name)
    142         if math.dtype(grid.data) != dtype:

~/.virtualenvs/torch1venv3/lib/python3.6/site-packages/phiflow-1.4.0-py3.6.egg/phi/physics/field/grid.py in sample(value, domain, batch_size, name)
     53         else:  # value is constant
     54             components = math.staticshape(value)[-1] if math.ndims(value) > 0 else 1
---> 55             data = math.zeros((batch_size,) + tuple(domain.resolution) + (components,)) + value
     56         return CenteredGrid(data, box=domain.box, extrapolation=Material.extrapolation_mode(domain.boundaries), name=name)
     57 

TypeError: add(): argument 'other' (position 1) must be Tensor, not numpy.ndarray
```

I tried to track down where exactly my Tensor becomes a numpy.ndarray, but could not find it. I get this error on both the master and develop branches.

A couple of comments regarding the piece of code above:

  1. The import looks like this because setup.py does not build phi.torch.
  2. I use torch.zeros() because math.zeros() returns a numpy.ndarray.

Looking forward to your answer, Nikolai

holl- commented 4 years ago

Hi Nikolai,

Thanks for your feedback! Unfortunately we don't have anyone with a lot of experience in PyTorch in our group, so we don't yet have advanced tests.

I'll look into the error shortly and post a fix.

Best, Philipp

holl- commented 4 years ago

I have fixed this issue in the develop branch. The following script now runs until the last line.

import torch
from phi.torch.flow import *
from torch.utils.data import TensorDataset, DataLoader

inputs = torch.rand(1000, 10, 10, 1)
targets = torch.rand(1000, 10, 10, 1)
dataset = TensorDataset(inputs, targets)
dataloader = DataLoader(dataset, batch_size=10)
domain = Domain([10, 10], boundaries=PERIODIC)
sz = domain.staggered_grid(0).staggered_tensor().shape
initial_velocity = torch.zeros(*sz)
initial_velocity = initial_velocity.expand(10, *initial_velocity.shape[1:])
velocity = torch.nn.Parameter(initial_velocity)
optimizer = torch.optim.Adam([velocity], lr=1e-3)
for epoch in range(10):
    for batch_index, (inputs, targets) in enumerate(dataloader):
        optimizer.zero_grad()
        domain = Domain([10, 10], boundaries=PERIODIC)
        fluid = Fluid(domain, batch_size=10, density=inputs, velocity=velocity)
        density = advect.semi_lagrangian(fluid.density, fluid.velocity, dt=1)
        density = diffuse(density, 1 * 0.1, substeps=1)
        loss = math.l2_loss(density.data - targets)
        loss.backward()
        optimizer.step()

However, optimizer.step() still fails with the message "more than one element of the written-to tensor refers to a single memory location. Please clone() the tensor before performing the operation." It sounds to me like the problem lies in the setup. Can you confirm this?

Note that Fluid.density is read-only. Use fluid.copied_with(density=new_density) to make an altered copy.
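The read-only-plus-copy pattern described above can be illustrated without PhiFlow. The sketch below is only an analogy built on Python's standard library: `FluidState` and its fields are hypothetical stand-ins, and its `copied_with` method mimics the name of the call mentioned above rather than reproducing PhiFlow's implementation.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class FluidState:
    """Hypothetical immutable state; NOT PhiFlow's Fluid class."""
    density: float
    velocity: float

    def copied_with(self, **changes):
        # Build a new object with the given fields replaced; self is left unchanged.
        return replace(self, **changes)


state = FluidState(density=1.0, velocity=0.0)

# Altered copy: the original state keeps its old density.
new_state = state.copied_with(density=0.5)

# Direct assignment to a frozen field is rejected with an AttributeError subclass.
try:
    state.density = 0.5
except AttributeError as err:
    assignment_error = err
```

Under this pattern, each simulation step produces a new state object instead of mutating the previous one, which is why the corrected script stores the advected density in a separate variable.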

ne3x7 commented 4 years ago

Thank you for such a prompt response. Indeed, the Fluid object is now created correctly. I can also confirm that the problem with the optimizer lies in the setup, specifically in the line initial_velocity = initial_velocity.expand(10, *initial_velocity.shape[1:]). This operation should instead be done at each step when creating the Fluid object. I am now closing the issue, thank you again!
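A minimal sketch of that resolution, using plain PyTorch only. The tensor shape and the quadratic loss are assumptions made for illustration (they stand in for the staggered velocity tensor and the simulation loss); no PhiFlow calls are involved. The trainable parameter keeps a batch dimension of 1 and is expanded inside the loop, so the optimizer updates memory that the parameter actually owns.

```python
import torch

# Hypothetical shape standing in for the staggered velocity tensor of a 10x10 domain.
base = torch.zeros(1, 11, 11, 2)

# expand() allocates no new memory: the batch dimension gets stride 0,
# so all 10 batch entries alias the same storage.
expanded = base.expand(10, *base.shape[1:])
batch_stride = expanded.stride()[0]

# Keep the trainable parameter at batch size 1 and expand it inside the loop.
velocity = torch.nn.Parameter(base.clone())
optimizer = torch.optim.Adam([velocity], lr=1e-3)

for step in range(3):
    optimizer.zero_grad()
    batched_velocity = velocity.expand(10, *velocity.shape[1:])
    # Stand-in for the simulation loss; the real loop would build a Fluid here.
    loss = ((batched_velocity - 1.0) ** 2).sum()
    loss.backward()   # gradients from all 10 batch entries are summed into velocity.grad
    optimizer.step()  # in-place update on memory the parameter owns
```

If each batch entry should instead be optimized independently, calling clone() on the expanded tensor before wrapping it in torch.nn.Parameter, as the error message suggests, gives every entry its own memory.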