microsoft / service-fabric-cli

Service Fabric CLI Tools

sfctl upload and provision commands on Linux folder clean-up #112

Open MedAnd opened 6 years ago

MedAnd commented 6 years ago

Running the install.sh generated by the VS Code Service Fabric plugin, I noticed that the sfctl upload & provision commands are used on Linux, but it seems the folders are not cleaned up after failed deployments.

For example, looking at the cluster location path(s):

I notice these folders are not cleaned up even when the deployment fails. Moreover, after doing an sfctl upload & provision I can see the AppType in Service Fabric Explorer, however the /home/sfuser/sfdevcluster/data/ImageBuilderProxy/AppType folder is empty. I'm assuming I should see my application in the AppType folder?


Lastly, even though I do not see any AppType in Service Fabric Explorer, running sfctl store root-info still shows the content below. Should this also be cleaned up when an AppType is removed via Explorer? It seems this could cause a stale or wrong package to be deployed if the old and new application packages share the same AppType and version. So in my example, a deployment fails (or I remove everything via Service Fabric Explorer), yet deploying with sfctl picks up a cached package that has the same AppType and version.


cc @suhuruli

suhuruli commented 6 years ago

@Christina-Kang

Christina-Kang commented 6 years ago

Thanks for the feedback! There is a distinction between where the data you upload is stored and the folder locations Service Fabric uses for provisioning purposes. /home/sfuser/sfdevcluster/data/ImageBuilderProxy/* is used internally, its cleanup after a failure may be delayed while waiting on a timer, and its operations should be transparent to the user. The data you have uploaded via sfctl's upload command is not deleted until you explicitly request that from Service Fabric.

Unprovisioning an application type does not automatically remove its contents from the store, which is why you continue to see the files in the store after removing the application type, even though no such application type exists in the cluster. That content needs to be removed with sfctl store delete. The idea is that Service Fabric does not know for sure whether the uploaded content will be needed again.
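
For example, something along these lines (the content path here is just a placeholder for whatever folder name you uploaded under):

# list what is currently in the image store root
sfctl store root-info

# remove the uploaded package content you no longer need
sfctl store delete --content-path DemoAppPackage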

Hope that clarifies things a bit!

MedAnd commented 6 years ago

Hi @Christina-Kang, appreciate the clarification. I'm not sure whether my issue is a bug or not though, so just to clarify in more detail:

  1. Create a demo solution called Demo, which packages down into 500 files (DLLs etc.)
  2. Convert an existing Windows solution with three projects, "also" called Demo, which packages down into 800 files (DLLs etc.)
  3. On Ubuntu Linux, first deploy the larger Demo (800 file) solution using install.sh as generated by the VS Code SF plugin - the deployment terminates with the error "Unable to load shared library 'libFabricCommon.so' or one of its dependencies". No Type or Instance is visible via Service Fabric Explorer.
  4. Then attempt to deploy the smaller (500 file) solution, which has the same Type and Version; however, in Service Fabric Explorer notice that the larger (800 file) solution is deployed to Service Fabric instead!

I suspect the above is a bug, as the failed 800 file solution should have been cleaned up. Instead, the smaller Demo solution, which has the same Type and Version, is being confused with the larger one, and the wrong files are being deployed.

The workaround was to clean up the cluster folders manually, followed by removing the content from the store via sfctl, even though nothing is visible in Service Fabric Explorer.

PS. Let me know if this makes sense?

Thx.

suhuruli commented 6 years ago

@MedAnd the install.sh script is a wrapper around three separate SF CLI commands. I believe the initial installation is failing at the provision command (which we should investigate anyway), but the upload to the image store was successful. In order to then install the second one, you need to run the command to delete the contents of the image store.
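
Roughly, the generated script boils down to something like the sketch below (the package folder, application name, type, and version are placeholders based on your Demo example, not the exact values the plugin emits):

# what install.sh wraps, more or less
sfctl application upload --path DemoApp
sfctl application provision --application-type-build-path DemoApp
sfctl application create --app-name fabric:/Demo --app-type DemoType --app-version 1.0.0

# before re-installing a different package with the same type/version,
# clear the previously uploaded content from the image store:
sfctl store delete --content-path DemoApp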

suhuruli commented 6 years ago

@MedAnd as a separate discussion point, I notice you have been playing with the developer experience on Linux and Mac using our new tooling and opening a lot of issues around it. If you are interested, we can chat on a call about your specific issues :) We love feedback like this and will actively make fixes ASAP.

MedAnd commented 6 years ago

@suhuruli - what's the best way to set up the above discussion?

MedAnd commented 6 years ago

Happy to provide feedback & I understand the point about install.sh being a wrapper, but I'm not sure the current behaviour is expected... or rather, when a deployment via PowerShell fails on Windows, I usually do not have to clean up anything... the PowerShell way feels more atomic, while the Linux experience requires me to know that I have to compensate for a failed deployment, if that makes sense?

suhuruli commented 6 years ago
MedAnd commented 6 years ago

Have sent you an email @suhuruli...

Re the above though, I'm wondering how feasible it would be to support PowerShell Core on Linux as a deployment process? For example, if we could keep the current VS 2017 project & folder structure cross platform... (with tooling in VS Code of course)... could PowerShell Core on Linux be used to port the Windows PowerShell Service Fabric deployment process?

This way we would have the same project structure, packaging and deployment process cross platform...

suhuruli commented 6 years ago

I see, this is an option, but we do not have any plans to investigate PowerShell Core on Linux tooling. The uber point that you are making, portability of projects created in VS 2017 onto VS Code, is one that we will definitely try our best to make easy.

MedAnd commented 6 years ago

I was hoping PowerShell Core on Linux would facilitate the same deployment experience as Service Fabric on Windows (i.e. without the failure compensation)... but if sfctl is the preferred cross-platform tool, I think the current approach might need some tweaking...

Out of interest what is the correct way of cleaning up a failed deployment... for example:

sfctl application upload --path

The command copies a Service Fabric application package to the image store... I assume this means the package is copied to:

/home/sfuser/sfdevcluster/data/ImageBuilderProxy/App

At this stage I should not see the application in Service Fabric Explorer until I run:

sfctl application provision

If the above command succeeds, I should see the package in Service Fabric Explorer & somewhere in:

/home/sfuser/sfdevcluster/data/ImageBuilderProxy/AppType

If the above command fails, I manually have to run:

sfctl store delete --content-path

Thx

suhuruli commented 6 years ago

This is correct. The uninstall.sh scripts contain the commands to run to completely clean an application.
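
For reference, the full clean-up boils down to roughly the following (a sketch only; the application id, type name, version, and content path are placeholders, so check the generated uninstall.sh for the exact values):

# remove the running application instance
sfctl application delete --application-id Demo

# unregister the application type from the cluster
sfctl application unprovision --application-type-name DemoType --application-type-version 1.0.0

# delete the uploaded package from the image store
sfctl store delete --content-path DemoApp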

MedAnd commented 6 years ago

@suhuruli - within the .sh scripts, could the result of "sfctl application provision" be used to determine whether a "sfctl store delete --content-path" automatically needs to be done (roughly like the sketch below)? That way it would appear as if deployment were an atomic operation... or would this be dangerous?
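
A rough sketch of what I mean, assuming sfctl returns a non-zero exit code when provisioning fails (the package folder, application name, type, and version are placeholders):

#!/bin/bash
set -e

sfctl application upload --path DemoApp

if ! sfctl application provision --application-type-build-path DemoApp; then
    # Provisioning failed: remove the uploaded content so a later attempt
    # with the same AppType and version does not pick up this stale package.
    sfctl store delete --content-path DemoApp
    exit 1
fi

sfctl application create --app-name fabric:/Demo --app-type DemoType --app-version 1.0.0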

PS. Happy to close this issue as the current approach on Linux seems to be by design, hoping it can be tweaked at some stage to be more atomic?

suhuruli commented 6 years ago

Hi @MedAnd,

I think this would be possible if we returned some sort of meaningful message in the first request. This would require some custom scripts.

@Christina-Kang, what say you on this?

Thanks, Sudhanva

Christina-Kang commented 6 years ago

Hi @MedAnd,

Provision application, including as part of the .sh script, does not clean up by design, since Service Fabric doesn't know whether the user will want the application packages again, even after a failed deployment. For example, if a single file is causing the error, that file can be replaced instead of deleting everything and starting over again. In both PowerShell and Linux, clean-up should be a separate step. Can you elaborate on the atomic-ness of the PowerShell experience?
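
As a sketch of what I mean (the package folder name is a placeholder): after fixing the offending file in your local package folder, you could simply re-run the upload and provision rather than deleting the store content first:

sfctl application upload --path DemoApp
sfctl application provision --application-type-build-path DemoApp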

That being said, we can definitely look into how we can make it more explicit that a delete is required as a separate step.

Thanks! Christina

MedAnd commented 6 years ago

Hi @Christina-Kang, appreciate the ability to provide feedback... with PowerShell and VS 2017 on Windows, as a developer I do not need to delete a failed deployment. That is, I make the necessary changes and publish again from VS 2017.

On Linux though, using sfctl + VS Code, I have to explicitly issue a "sfctl store delete --content-path" or run the uninstall script after a failed install.sh attempt. The two experiences feel different and, coming from Windows, not what I expected.

anantshankar17 commented 6 years ago

Thanks for the feedback. We will fix this to make the behavior consistent across platforms/tools.

MedAnd commented 5 years ago

Hi @suhuruli, nice to see alignment across platforms using PowerShell Core, but I'm wondering if/when the 2nd part is coming, which will re-align the VS 2017 project & folder structures to be the same across platforms?

suhuruli commented 5 years ago

Hey, you're referring to VS and VSCode project structures being compatible? This work will land soon. The first part we are working on is ensuring that SF Mesh in VS and VSCode are in sync. Following this, we will ensure compatibility between the plugins for Reliable Services.

MedAnd commented 5 years ago

Nice... so the vision of taking an SF project built in VS on Windows & compiling it in VS Code on Linux is coming... any rough ETAs ☺️