Azure / azure-functions-java-library

Contains annotations for writing Azure Functions in Java
MIT License

Time out when I access Azure Data Lake Gen2 in an Azure Function with the ADLS Java SDK v12 #113

Open gjjtip opened 4 years ago

gjjtip commented 4 years ago
My function code is as follows:

package org.example;

import com.azure.storage.common.StorageSharedKeyCredential;
import com.azure.storage.file.datalake.DataLakeFileSystemClient;
import com.azure.storage.file.datalake.DataLakeServiceClient;
import com.azure.storage.file.datalake.DataLakeServiceClientBuilder;
import com.microsoft.azure.functions.*;
import com.microsoft.azure.functions.annotation.AuthorizationLevel;
import com.microsoft.azure.functions.annotation.FunctionName;
import com.microsoft.azure.functions.annotation.HttpTrigger;

import java.util.Optional;

/**
 * Azure Functions with HTTP Trigger.
 */
public class Function {

    public static String accountName = "***";
    public static String accountKey = "***";

    @FunctionName("HttpExample")
    public HttpResponseMessage run(
            @HttpTrigger(name = "req", methods = {HttpMethod.GET, HttpMethod.POST}, authLevel = AuthorizationLevel.ANONYMOUS) HttpRequestMessage<Optional<String>> request,
            final ExecutionContext context) {
        context.getLogger().info("Java HTTP trigger processed a request.+++++");

        DataLakeServiceClient dataLakeServiceClient = getDataLakeServiceClient(accountName, accountKey);
        DataLakeFileSystemClient dataLakeFileSystemClient = getFileSystem(dataLakeServiceClient);

        String name = listFilesInDirectory(dataLakeFileSystemClient, context);

        if (name == null) {
            return request.createResponseBuilder(HttpStatus.BAD_REQUEST).body("Please pass a name on the query string or in the request body").build();
        } else {
            return request.createResponseBuilder(HttpStatus.OK).body("List File Names: " + name).build();
        }
    }

    public static DataLakeServiceClient getDataLakeServiceClient(String accountName, String accountKey) {

        StorageSharedKeyCredential sharedKeyCredential =
                new StorageSharedKeyCredential(accountName, accountKey);

        DataLakeServiceClientBuilder builder = new DataLakeServiceClientBuilder();

        builder.credential(sharedKeyCredential);
        builder.endpoint("https://" + accountName + ".dfs.core.windows.net");

        return builder.buildClient();
    }

    public static DataLakeFileSystemClient getFileSystem(DataLakeServiceClient serviceClient) {

        return serviceClient.getFileSystemClient("test");
    }

    public static String listFilesInDirectory(DataLakeFileSystemClient fileSystemClient, ExecutionContext context) {

        // NOTE: this is the call that always hangs when running inside Azure Functions.
        fileSystemClient.listPaths().forEach(path -> context.getLogger().info(path.getName()));

        return "ABC";
    }
}

The Java SDK docs: https://azuresdkdocs.blob.core.windows.net/$web/java/azure-storage-file-datalake/12.0.0-preview.6/index.html

Based on my own troubleshooting, the code always hangs on this line: fileSystemClient.listPaths().forEach(path -> context.getLogger().info(path.getName()));

There is no response at all. I tested the above code outside Azure Functions (say, in a plain main method) and everything works, so I think the code itself is fine. Has anyone run into the same issue? Are there any restrictions in Azure Functions on accessing ADLS Gen2 with the Java SDK? Please help.
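For reference, here is a minimal sketch of the standalone test I ran (same placeholder account name/key as above, not my exact code):

import com.azure.storage.common.StorageSharedKeyCredential;
import com.azure.storage.file.datalake.DataLakeFileSystemClient;
import com.azure.storage.file.datalake.DataLakeServiceClient;
import com.azure.storage.file.datalake.DataLakeServiceClientBuilder;

public class Main {
    public static void main(String[] args) {
        // Build the client exactly as in the function above.
        StorageSharedKeyCredential credential = new StorageSharedKeyCredential("***", "***");
        DataLakeServiceClient serviceClient = new DataLakeServiceClientBuilder()
                .credential(credential)
                .endpoint("https://***.dfs.core.windows.net")
                .buildClient();
        DataLakeFileSystemClient fileSystemClient = serviceClient.getFileSystemClient("test");

        // Completes normally here, but the same call hangs inside the Function app.
        fileSystemClient.listPaths().forEach(path -> System.out.println(path.getName()));
    }
}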

DracoTaffi commented 4 years ago

I have the same issue. I can get the FileSystemClient and DirectoryClient objects, but checking for existence times out no matter what value I use for the timeout:

import com.azure.core.util.Context;
import com.azure.storage.file.datalake.DataLakeDirectoryClient;
import com.azure.storage.file.datalake.DataLakeFileSystemClient;
import java.time.Duration;

// adlsClient is a DataLakeServiceClient built the same way as in the original post.
DataLakeFileSystemClient adlsfsClient = adlsClient.getFileSystemClient(fileSystem);
DataLakeDirectoryClient adlsRootDirClient = adlsfsClient.getDirectoryClient("myrootdir");
Duration timeout = Duration.ofMillis(10000);
// Times out on the existence check regardless of the timeout value:
if (!adlsRootDirClient.existsWithResponse(timeout, Context.NONE).getValue()) {
    adlsRootDirClient.create();
}

Same as you, running this code from my local PC works fine!

joshfree commented 4 years ago

I've ported this GitHub issue to the Azure/azure-sdk-for-java repo, where the Storage team will be able to follow up shortly.

https://github.com/Azure/azure-sdk-for-java/issues/13846

amamounelsayed commented 4 years ago

Could you please follow the steps in this issue and check whether the problem still exists: https://github.com/Azure/azure-functions-java-worker/issues/381

scgbear commented 4 years ago

Internal Tracking devdivcsef 371565

amamounelsayed commented 4 years ago

@gjjtip Please let us know if you still face this issue. Thank you so much!

s3vhub commented 3 years ago

I can confirm that this issue still exists. Using dependency 12.2.0 of the Azure Data Lake SDK, the function app freezes whenever a createDirectory call is made while running in Azure. I also tried the Blob Storage SDK against ADLS and hit the same issue when using new XSSFWorkbook(blobclient.inputstream()): the function app just freezes and times out with a 503 Service Unavailable, while the same code works fine when run via a plain Java main method locally. Using the legacy Azure cloud storage SDK serves as a workaround for reading the stream, but it is not a solution, because I need to list the paths in the container/directory.
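Roughly the two calls that freeze for me (a sketch with placeholder names, not my exact production code; it assumes an azure-storage-blob version that provides BlobClient.openInputStream()):

import com.azure.storage.blob.BlobClient;
import com.azure.storage.file.datalake.DataLakeFileSystemClient;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

import java.io.IOException;
import java.io.InputStream;

public class FreezeRepro {

    // Freezes when deployed to the Function app; works from a local main method.
    static void createDir(DataLakeFileSystemClient fileSystemClient) {
        fileSystemClient.createDirectory("dir1"); // hangs here in Azure
    }

    // Same symptom through the Blob Storage SDK: freezes, then a 503 time-out.
    static void readWorkbook(BlobClient blobClient) throws IOException {
        try (InputStream in = blobClient.openInputStream();
             XSSFWorkbook workbook = new XSSFWorkbook(in)) {
            System.out.println("Sheets: " + workbook.getNumberOfSheets());
        }
    }
}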

amamounelsayed commented 3 years ago

@s3vhub Did you try adding the application setting FUNCTIONS_WORKER_JAVA_LOAD_APP_LIBS set to True or 1? Could you also share more information, such as the runtime version and the application name?
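If it helps, app settings are surfaced to the worker as environment variables, so a quick sanity check (just a sketch, not an official diagnostic) is to log the value from inside the function body:

// Sketch: confirm the app setting is visible to the Java worker process.
String loadAppLibs = System.getenv("FUNCTIONS_WORKER_JAVA_LOAD_APP_LIBS");
context.getLogger().info("FUNCTIONS_WORKER_JAVA_LOAD_APP_LIBS=" + loadAppLibs);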

s3vhub commented 3 years ago

Yes, I tried this today by setting the value to 1, and it threw a lot of errors, as if the jars weren't loading. The JDK version is JDK 8, the runtime version on the function app is 3, and the application name is xxx-parser, but it doesn't seem to be application-name specific.

It also happens with the Azure Blob Storage SDK, which I also tried.

s3vhub commented 3 years ago

> @s3vhub Did you try adding the application setting FUNCTIONS_WORKER_JAVA_LOAD_APP_LIBS set to True or 1? Could you also share more information, such as the runtime version and the application name?

Please elaborate: what more information will you need?

amamounelsayed commented 3 years ago

Thank you @s3vhub, this information will be helpful. Please provide the following:

For the runtime version, you will see it as 3.x.x in the Overview section of the UI. Is it Linux or Windows? And is it a Dedicated/Premium or a Consumption plan? Can you also share the errors you got when you set the value to 1?

s3vhub commented 3 years ago

> Thank you @s3vhub, this information will be helpful. Please provide the following:
>
>   • Timestamp:
>   • Function App name:
>   • Function name(s) (as appropriate):
>   • Invocation ID:
>   • Region:
>
> Can you also share the errors you got when you set the value to 1?

  1. Since this environment belongs to the customer, I can share the timestamp, the region, and the app name partially. Would that be enough?
  2. However, I have a question: does the above-mentioned application setting also work for the azure-storage-blob SDK, which I could use to read from ADLS, or is it specific to the Data Lake Gen2 SDK?

amamounelsayed commented 3 years ago

This setting, as described in https://github.com/Azure/azure-functions-java-worker/issues/381, basically makes sure there is no conflict between the SDK jars and the Azure Functions worker's own jars.

If you can share the region, the invocation ID, and the partial app name, we can check how much information we can gather.

s3vhub commented 3 years ago

> This setting, as described in Azure/azure-functions-java-worker#381, basically makes sure there is no conflict between the SDK jars and the Azure Functions worker's own jars.
>
> If you can share the region, the invocation ID, and the partial app name, we can check how much information we can gather.

For the Azure Blob Storage SDK, I have added FUNCTIONS_WORKER_JAVA_LOAD_APP_LIBS = True. I had earlier set the value to 1, which was throwing the errors; however, setting it to True seems to have fixed the issue. Setup: Java 8, Azure Functions version 3, azure-storage-blob 12.6.0, on an App Service plan.

I have yet to try it with the Data Lake SDK, but I can confirm it works with the Blob Storage SDK to read a blob in ADLS at a path like container/dir1/dir2/yyyy/MM/dd/file.xls.
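For completeness, this is roughly how I read that blob with azure-storage-blob 12.6.0 (a sketch with placeholder account, key, and container names, not my exact code):

import com.azure.storage.blob.BlobClient;
import com.azure.storage.blob.BlobServiceClient;
import com.azure.storage.blob.BlobServiceClientBuilder;
import com.azure.storage.common.StorageSharedKeyCredential;

import java.io.ByteArrayOutputStream;

public class BlobReadWorkaround {
    public static void main(String[] args) throws Exception {
        // Works once FUNCTIONS_WORKER_JAVA_LOAD_APP_LIBS=True is set on the app.
        BlobServiceClient serviceClient = new BlobServiceClientBuilder()
                .endpoint("https://<account>.blob.core.windows.net")
                .credential(new StorageSharedKeyCredential("<account>", "<key>"))
                .buildClient();

        BlobClient blobClient = serviceClient
                .getBlobContainerClient("container")
                .getBlobClient("dir1/dir2/yyyy/MM/dd/file.xls");

        // Download the whole blob into memory and hand it to the parser.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        blobClient.download(out);
        System.out.println("Downloaded " + out.size() + " bytes");
    }
}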

amamounelsayed commented 3 years ago

Thank you @s3vhub. Could you please retry the same working app with the value 1 again? Both values should work: https://github.com/Azure/azure-functions-java-worker/blob/dev/src/main/java/com/microsoft/azure/functions/worker/Util.java#L5

Please let us know once you have confirmed the Data Lake SDK as well.

Thank you so much for your support!

s3vhub commented 3 years ago

Sorry for the delay. I will check and update.

piyushdubey commented 3 months ago

I am seeing this issue with Java 11 as well. I have tried setting FUNCTIONS_WORKER_JAVA_LOAD_APP_LIBS to true / 1. Is there any fix available for this yet?

OS = Linux, Java runtime = 11.

Maven dependency for azure-storage-file-datalake = 12.20.0:

<dependency>
    <groupId>com.azure</groupId>
    <artifactId>azure-storage-file-datalake</artifactId>
    <version>12.20.0</version>
</dependency>