Azure / usql

U-SQL Examples and Issue Tracking
http://usql.io
MIT License

Unable to install U-SQL Extensions #157

Closed karthiksubramanians closed 4 years ago

karthiksubramanians commented 5 years ago

I was trying to install the U-SQL Extensions for Python using the recommended steps:

  1. Going to the sample scripts under Data Lake Analytics
  2. Choosing to install the 2.5GB extensions

I am receiving an error message stating:

Error
An unexpected error occurred.

How should I debug this error so that I can install the U-SQL extensions? Without installing them in the portal, I am unable to install them locally in Visual Studio.

Alternatively, is there an official download page for the U-SQL extension DLL files? (Specifically, my current requirement is the ExtPy.dll file, which is used to create the ExtPython assembly in U-SQL.)

Any response to assist is appreciated.

karthiksubramanians commented 4 years ago

Any updates on this?

ac2707 commented 4 years ago

Facing the same issue. Any updates? Please advise.

Thanks

asears commented 4 years ago

I was able to successfully install these extensions a few times. Is this with a new account? What location?

It looks like this forum isn't actively maintained and the roadmap for U-SQL and Azure Data Lake Analytics is unclear.

https://feedback.azure.com/forums/327234-data-lake/suggestions/36445702-add-support-for-adls-gen2-to-adla

Should a notice be put up on the readme? @MikeRys @saveenr ?

There are many examples online related to U-SQL, and Azure Data Lake Analytics would be the obvious option for a user unfamiliar with the services to try to install when creating an Azure Data Lake (Gen 1 or 2).

It's unfortunate that it still does not work with ADLS Gen2, as there are some use cases where it suits workloads much better than Spark. It would be interesting if the ADLA service and Extensions could be open-sourced as part of this codebase. There might be some use cases for it with Kubernetes.

From the docs:

Microsoft supports several Analytics services such as Azure Databricks and Azure HDInsight as well as Azure Data Lake Analytics. We hear from developers that they have a clear preference for open-source-solutions as they build analytics pipelines. To help U-SQL developers understand Apache Spark, and how you might transform your U-SQL scripts to Apache Spark, we've created this guidance.

If you are using Python, I would suggest Databricks, Azure Functions, or Azure Kubernetes Service as a platform over U-SQL.

If you would like to use C# syntax, have a look at the new dotnet library, Spark.NET.

If there is an issue with the portal and ADLA functionality, you could get faster feedback by opening a ticket with Microsoft support.

Here's my resource definition in case it helps.

{
  "properties": {
    "firewallAllowAzureIps": "Disabled",
    "firewallRules": [],
    "virtualNetworkRules": [],
    "debugDataAccessLevel": "All",
    "defaultDataLakeStoreAccount": "myadla",
    "dataLakeStoreAccounts": [
      {
        "properties": {
          "suffix": "azuredatalakestore.net"
        },
        "name": "myadla"
      }
    ],
    "publicDataLakeStoreAccounts": [
      {
        "properties": {
          "suffix": "azuredatalakestore.net"
        },
        "name": "adltrainingsampledata"
      },
      {
        "properties": {
          "suffix": "azuredatalakestore.net"
        },
        "name": "ghinsights"
      }
    ],
    "storageAccounts": [],
    "maxDegreeOfParallelism": 250,
    "maxJobCount": 20,
    "systemMaxDegreeOfParallelism": 250,
    "systemMaxJobCount": 20,
    "maxDegreeOfParallelismPerJob": 250,
    "minPriorityPerJob": 1,
    "computePolicies": [],
    "queryStoreRetention": 30,
    "hiveMetastores": [],
    "currentTier": "Consumption",
    "newTier": "Consumption",
    "provisioningState": "Succeeded",
    "state": "Active",
    "endpoint": "myadla.azuredatalakeanalytics.net",
    "accountId": "xxx",
    "creationTime": "2017-05-26T00:11:50.5317536Z",
    "lastModifiedTime": "2017-05-26T00:11:50.5317536Z"
  },
  "location": "eastus2",
  "id": "/subscriptions/xxx/resourceGroups/datalakerg/providers/Microsoft.DataLakeAnalytics/accounts/hydroadla",
  "name": "myadla",
  "type": "Microsoft.DataLakeAnalytics/accounts"
}
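
For anyone comparing their own account against the definition above, here is a minimal Python sketch that pulls out the fields most likely to matter (location, state, tier, default store). It assumes the resource JSON has been copied from the portal into a string. `summarize_account` and `diff_accounts` are illustrative names, not part of any Azure SDK, and the choice of fields is only a guess at what is relevant.

```python
import json

# Field names taken from the resource definition above; the selection is illustrative.
TOP_LEVEL_FIELDS = ("name", "location", "type")
PROPERTY_FIELDS = (
    "provisioningState",
    "state",
    "currentTier",
    "defaultDataLakeStoreAccount",
    "firewallAllowAzureIps",
)


def summarize_account(resource_json: str) -> dict:
    """Extract a flat summary from an ADLA account resource definition."""
    resource = json.loads(resource_json)
    properties = resource.get("properties", {})
    summary = {key: resource.get(key) for key in TOP_LEVEL_FIELDS}
    summary.update({key: properties.get(key) for key in PROPERTY_FIELDS})
    return summary


def diff_accounts(working: dict, other: dict) -> dict:
    """Return {field: (working_value, other_value)} for fields that differ."""
    keys = sorted(set(working) | set(other))
    return {
        key: (working.get(key), other.get(key))
        for key in keys
        if working.get(key) != other.get(key)
    }
```

Calling `diff_accounts(summarize_account(working_json), summarize_account(failing_json))` lists only the fields that differ, which may help narrow down whether location or account state plays a role.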

asears commented 4 years ago

Note that you can download the compiled extpy.dll assembly once the extensions have been successfully installed in the Data Lake Analytics storage account. I have not seen any official distribution channel.
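
A rough sketch of that download in Python, assuming the `azure-datalake-store` package (for ADLS Gen1) and a service principal with read access to the store. The install location of the assembly is not stated in this thread, so the sketch searches for it rather than hard-coding a path. `find_assembly_paths` and `download_assembly` are hypothetical helper names, and all credentials and account names are placeholders.

```python
from typing import Iterable, List


def find_assembly_paths(paths: Iterable[str], file_name: str = "extpy.dll") -> List[str]:
    """Return the paths whose final segment equals file_name, ignoring case."""
    wanted = file_name.lower()
    return [p for p in paths if p.rsplit("/", 1)[-1].lower() == wanted]


def download_assembly(store_name: str, tenant_id: str, client_id: str,
                      client_secret: str, local_path: str,
                      search_root: str = "/") -> str:
    """Locate extpy.dll in the given Data Lake Store and download it.

    Assumption: the extensions were already installed successfully, so the
    file exists somewhere under search_root. Walking a large store from the
    root can be slow; pass a narrower search_root if the folder is known.
    """
    # Imported lazily so find_assembly_paths works without the SDK installed.
    from azure.datalake.store import core, lib, multithread

    token = lib.auth(tenant_id=tenant_id, client_id=client_id,
                     client_secret=client_secret)
    adl = core.AzureDLFileSystem(token, store_name=store_name)

    matches = find_assembly_paths(adl.walk(search_root))
    if not matches:
        raise FileNotFoundError("extpy.dll not found under " + search_root)

    # Download the first match, replacing any existing local copy.
    multithread.ADLDownloader(adl, rpath=matches[0], lpath=local_path,
                              overwrite=True)
    return matches[0]
```

The search step is separated from the download so the matching logic can be checked without network access or credentials.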

Try creating a new ADLA account in the eastus2 location and installing the extensions.

mik1893 commented 4 years ago

Still not working.