Azure / service-fabric-mesh-preview

Service Fabric Mesh is Service Fabric's serverless offering that lets developers deploy containerized applications without managing infrastructure. Service Fabric Mesh, aka project “SeaBreeze”, is currently available in private preview. This repository will be used for tracking bugs/feature requests as GitHub issues and for maintaining the latest documentation.
MIT License

Reliable Volumes ignored #336

Open aloneguid opened 5 years ago

aloneguid commented 5 years ago

It seems that the reliable volume resource is completely ignored on deployment. The local cluster just hangs and fails; the Azure deployment succeeds, but no volume resources are listed in the portal. Here is my YAML definition:

...
        codePackages:
        - name: xxx
          image: xxx:dev
          volumes:
          - name: sfvol
            creationParameters:
              kind: ServiceFabricVolumeDisk
              sizeDisk: small
            destinationPath: c:\app\data
...

which compiles to ARM as per docs:

…
                  "volumes": [
                    {
                      "name": "sfvol",
                      "creationParameters": {
                        "kind": "ServiceFabricVolumeDisk",
                        "sizeDisk": "small"
                      },
                      "destinationPath": "c:\\app\\data"
                    }
                  ],
...

There's nothing after deployment to Azure:

image

mgrabarz commented 5 years ago

I think Volumes can be found (in Azure Portal) at the replica level, since this is the place where you mount them.

aloneguid commented 5 years ago

It's not in replicas view either, I literally checked everything.

mgrabarz commented 5 years ago

I just tried one of my templates. The volume should be listed one level below (CodePackages), but the list is empty in my case. I then checked the mounts (running a Linux container) and I can see it: /dev/sfblkdev11 /data ext4 rw,relatime,data=ordered 0 0
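
The mount check described above can be scripted from inside a Linux container. This is a minimal sketch, assuming the volume driver exposes devices under the `/dev/sfblkdev` prefix seen in the mount line; that prefix is an observation from this thread, not a documented contract, and `find_sf_volume_mounts` is a hypothetical helper name.

```python
def find_sf_volume_mounts(mounts_text, device_prefix="/dev/sfblkdev"):
    """Return (device, mount_point, fs_type) for each mount on a matching device.

    `mounts_text` is the content of /proc/mounts. The default prefix is an
    assumption based on the single mount line quoted in this thread.
    """
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Each /proc/mounts line is: device, mount point, fs type, options, dump, pass.
        if len(fields) >= 3 and fields[0].startswith(device_prefix):
            found.append((fields[0], fields[1], fields[2]))
    return found
```

Inside the container, pass it the text read from `/proc/mounts`; an empty result would suggest the volume was never attached to that replica.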

mattrowmsft commented 5 years ago

I'm guessing this is a bug in the portal not displaying the field. You can try a REST call to see the full JSON being returned.
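
One way to make that REST check is sketched below, assuming the standard Azure Resource Manager URL layout and the `2018-09-01-preview` API version used by the ARM template in this thread. The helper names are hypothetical, the access token has to be supplied by the caller (for example from `az account get-access-token`), and `find_key` deliberately makes no assumption about the shape of the response.

```python
import json
import urllib.request

ARM_ENDPOINT = "https://management.azure.com"
API_VERSION = "2018-09-01-preview"  # assumed to match the apiVersion used for the deployment


def mesh_resource_url(subscription_id, resource_group, resource_path):
    """Build the ARM URL for a Mesh resource path such as 'applications/MyApp'."""
    return (
        f"{ARM_ENDPOINT}/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.ServiceFabricMesh/{resource_path}"
        f"?api-version={API_VERSION}"
    )


def fetch_json(url, bearer_token):
    """GET the resource with a caller-supplied ARM token and decode the JSON body."""
    request = urllib.request.Request(url, headers={"Authorization": f"Bearer {bearer_token}"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def find_key(node, key):
    """Collect every value stored under `key`, at any depth of the JSON document."""
    hits = []
    if isinstance(node, dict):
        for name, value in node.items():
            if name == key:
                hits.append(value)
            hits.extend(find_key(value, key))
    elif isinstance(node, list):
        for item in node:
            hits.extend(find_key(item, key))
    return hits
```

Calling `find_key(fetch_json(url, token), "volumes")` on the application or its services would show whether the service reports the volume even when the portal does not.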

I need to check what the local development experience with SF Volumes should be. It might be a limitation of the feature for now.

aloneguid commented 5 years ago

@mattrowmsft just wondering if you have any update on this?

Also just wondering if you can expose any mechanics behind reliable volumes. Are they based on reliable collections internally or how does it work / how reliable is it? Will we be able to back up reliable volumes in any way similar to what we can do with RC in full SF?

Is a reliable volume shared between all replicas or a service, or does it belong to just one replica?

What's the guarantee that a reliable volume won't go away on the next upgrade or when the number of replicas changes?

mattrowmsft commented 5 years ago

@mevora-msft do you know about dev experience and plans for backup/restore of sf volumes?

mevora-msft commented 5 years ago

"Reliable volumes" internally use SF reliable collections to store and replicate the data. As of today, we don't support backup/restore, but it is one of the items on our backlog.

Just to be clear, an "sf mesh replica" represents a separate instance of the service. For example, if you specify a "replica" count of "2", you will have two separate instances of the service running. "Reliable volumes" are not shared across replicas/services; each one is exclusive to a single replica instance. As of today, an sfvolumedisk app deployment fails if the "replica count" is greater than 1. This is a known issue and will be fixed in a future release.

Internally, "reliable volumes" data is replicated on 3 nodes using SF reliable collections. During a failover/upgrade, SF ensures that the container moves to a node where the up-to-date data exists.

BTW, I see that you ran into an issue while deploying the app on one-box. Is it working now? /cc @anantshankar17

aloneguid commented 5 years ago

@mevora-msft OneBox deployment just hangs on "Deploying application to local Service Fabric cluster..." The cluster has two applications running: fabric:/ServiceFabricVolumeDriver and fabric:/AzureFilesVolumePlugin. As soon as I remove the volume definition, deployment works again.

anantshankar17 commented 5 years ago

Hello @aloneguid, can you try with "sizeDisk: Small" instead of "sizeDisk: small"? That is probably why the volume resource is getting ignored altogether.
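
Both pitfalls raised so far (the `sizeDisk` casing and the replica-count limitation for `ServiceFabricVolumeDisk`) can be caught before deployment by linting the generated ARM template. This is only a sketch: `lint_template` is a hypothetical helper, and the set of accepted sizes is an assumption, since the thread confirms only the `Small` spelling.

```python
VOLUME_DISK_KIND = "ServiceFabricVolumeDisk"
# Assumed set of accepted sizes; only "Small" is confirmed in this thread.
ASSUMED_SIZES = {"Small", "Medium", "Large"}


def lint_template(template):
    """Return warning strings for Mesh application resources in a parsed ARM template."""
    warnings = []
    for resource in template.get("resources", []):
        if resource.get("type") != "Microsoft.ServiceFabricMesh/applications":
            continue
        for service in resource.get("properties", {}).get("services", []):
            props = service.get("properties", {})
            uses_volume_disk = False
            for package in props.get("codePackages", []):
                for volume in package.get("volumes", []):
                    params = volume.get("creationParameters", {})
                    if params.get("kind") != VOLUME_DISK_KIND:
                        continue
                    uses_volume_disk = True
                    size = params.get("sizeDisk")
                    if size not in ASSUMED_SIZES:
                        warnings.append(
                            f"{service.get('name')}/{volume.get('name')}: "
                            f"sizeDisk {size!r} is not one of {sorted(ASSUMED_SIZES)}"
                        )
            replicas = props.get("replicaCount", 1)
            # replicaCount may be an ARM expression string; only compare plain integers.
            if uses_volume_disk and isinstance(replicas, int) and replicas > 1:
                warnings.append(
                    f"{service.get('name')}: replicaCount {replicas} > 1 "
                    f"with {VOLUME_DISK_KIND} is reported to fail deployment"
                )
    return warnings
```

Load the template with `json.load` and print whatever `lint_template` returns; an empty list means neither of the two known problems was found, not that the deployment will succeed.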

aloneguid commented 5 years ago

Thanks @anantshankar17, unfortunately capitalising it doesn't make a difference in the OneBox scenario. I can't try deploying to Azure yet as I'm working in a car with a 3G connection. If you have managed to make it work, do you mind sharing a sample solution for VS?

anantshankar17 commented 5 years ago

These are the 3 YAML files (renamed to .txt) that I used to deploy; kindly try these and let us know if it still fails. Also please share the traces from C:\SfDevCluster\Log\Traces, and run this before deploying the app: docker pull seabreeze/azure-mesh-counter:0.1-nanoserver-1709

app.txt network.txt service.txt

aloneguid commented 5 years ago

@anantshankar17 thanks for this, however I'm deploying from Visual Studio, which has an extension to build Mesh applications.

anantshankar17 commented 5 years ago

@aloneguid all you have to do is create an SFMesh application (you can choose "console app") and then, in the service.yaml, just add the highlighted section: service.yaml.txt. I would suggest copying the volumes section from the attached service.yaml.txt to avoid indentation issues with the sample below:

Service definition

application:
  schemaVersion: 1.0.0-preview2
  name: Application9
  properties:
    services:

aloneguid commented 5 years ago

That's what I did, as stated in the original question. As you can see, it transforms to ARM properly, so I'm sure the indentation is correct. Have you tried to do this in Visual Studio?

anantshankar17 commented 5 years ago

Yes. The same works in Visual Studio for me on a local box setup; I need to look at your traces to see why it doesn't for you. Also please share the VS output from "Service Fabric Tools".

aloneguid commented 5 years ago

Thanks, let me get this ready.

aloneguid commented 5 years ago

@anantshankar17 here is a simple VS solution, attached as MeshVolumesTest.zip. For now I'm ignoring the fact that it's not working in OneBox mode; I'll come back to it later. VolumeService in Azure stays stuck in the "waiting" state for hours:

image

There are no volumes associated with it either:

image

and the full ARM template used to deploy this:

{
  "$schema": "http://schema.management.azure.com/schemas/2014-04-01-preview/deploymentTemplate.json",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "location": {
      "defaultValue": "SouthCentralUS",
      "type": "String",
      "metadata": {
        "description": "Location of the resources."
      }
    },
    "imageRegistryCredential_server": {
      "type": "string"
    },
    "imageRegistryCredential_username": {
      "type": "string"
    },
    "imageRegistryCredential_password": {
      "type": "securestring"
    }
  },
  "resources": [
    {
      "apiVersion": "2018-09-01-preview",
      "name": "MeshVolumesTest",
      "type": "Microsoft.ServiceFabricMesh/applications",
      "location": "[parameters('location')]",
      "dependsOn": [
        "Microsoft.ServiceFabricMesh/networks/MeshVolumesTestNetwork"
      ],
      "properties": {
        "services": [
          {
            "name": "VolumeService",
            "properties": {
              "description": "VolumeService description.",
              "osType": "Windows",
              "codePackages": [
                {
                  "name": "VolumeService",
                  "image": "algsfmesh.azurecr.io/volumeservice:20190122152304",
                  "volumes": [
                    {
                      "name": "sfvol",
                      "creationParameters": {
                        "kind": "ServiceFabricVolumeDisk",
                        "sizeDisk": "Small"
                      },
                      "destinationPath": "c:\\app\\data"
                    }
                  ],
                  "resources": {
                    "requests": {
                      "cpu": 1.0,
                      "memoryInGB": 1.0
                    }
                  },
                  "imageRegistryCredential": {
                    "server": "[parameters('imageRegistryCredential_server')]",
                    "username": "[parameters('imageRegistryCredential_username')]",
                    "password": "[parameters('imageRegistryCredential_password')]"
                  }
                }
              ],
              "replicaCount": 1,
              "networkRefs": [
                {
                  "name": "[resourceId('Microsoft.ServiceFabricMesh/networks', 'MeshVolumesTestNetwork')]"
                }
              ]
            }
          }
        ],
        "description": "MeshVolumesTest description."
      }
    },
    {
      "apiVersion": "2018-09-01-preview",
      "name": "MeshVolumesTestNetwork",
      "type": "Microsoft.ServiceFabricMesh/networks",
      "location": "[parameters('location')]",
      "dependsOn": [],
      "properties": {
        "description": "MeshVolumesTestNetwork description.",
        "networkAddressPrefix": "10.192.0.0/16",
        "kind": "Local"
      }
    }
  ],
  "outputs": {}
}

anantshankar17 commented 5 years ago

@aloneguid the ARM json seems fine. @mevora-msft / @mattrowmsft to check why this deployment is stuck.

For the local box, I tried the VS project that you shared and it works fine for me. That's why I'm asking you to share the traces from C:\SFDevCluster\Log\Traces and the "Build" and "Service Fabric Tools" output from VS, to find out what is stuck on your machine.

image