Open aashishgaba-cn opened 1 year ago
Hey @aashishgaba-cn, thanks for raising the issue. It looks like you included an RSA key in the parameter overrides in samconfig.toml. I wonder whether large chunks of parameter overrides could be the bottleneck. Can you provide your samconfig.toml and template.yaml so we can reproduce the issue? And what is the size of your samconfig.toml file?
You can scrub out the sensitive information; I mainly would like to understand the structure of your samconfig.toml.
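For anyone collecting the same details, here is a minimal Python sketch (not part of SAM CLI) that reports the size of a samconfig.toml and the length of each parameter_overrides value without printing the values themselves. It assumes Python 3.11+ for the standard-library tomllib module and a [default.global.parameters] table like the one posted in this thread; the helper names summarize_overrides and describe_samconfig are hypothetical.

```python
import os


def summarize_overrides(config, env="default", command="global"):
    """Return (parameter name, value length) pairs without exposing values."""
    params = config.get(env, {}).get(command, {}).get("parameters", {})
    overrides = params.get("parameter_overrides", [])
    if isinstance(overrides, str):
        # Assumption: some samconfig files store the overrides as one
        # space-separated string. This naive split does not handle quoted
        # values that contain spaces.
        overrides = overrides.split()
    summary = []
    for entry in overrides:
        name, _, value = entry.partition("=")
        summary.append((name, len(value)))
    return summary


def describe_samconfig(path="samconfig.toml"):
    """Print the file size and per-parameter value lengths."""
    import tomllib  # standard library from Python 3.11 onward

    with open(path, "rb") as handle:
        config = tomllib.load(handle)
    print(f"{path}: {os.path.getsize(path)} bytes")
    for name, length in summarize_overrides(config):
        print(f"  {name}: {length} characters")
```

Calling describe_samconfig() next to the config file gives output that is safe to paste into an issue, since only names and lengths are shown.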
Hey @hawflau, sure
version=0.1
[default.global.parameters]
stack_name = ""
region = "us-east-1"
parameter_overrides=[
"ToolkitProjectName=''abc",
"ToolkitProjectId=''abc",
"ToolkitProjectVersion='0.0.1'",
"S3Bucket='abc'",
"RawDocS3KeyPrefix=folders/{folder_id}/docs/raw",
"TransformedDocS3KeyPrefix=folders/{folder_id}/docs/transformed",
"ProjectId=''",
"ClientEmail=''",
"ClientId=''",
"ClientX509CertUrl=''",
"PrivateKeyId=''",
"PrivateKey=''",
]
[default.build.parameters]
cached = true
parallel = true
skip-pull-image = true
[default.deploy.parameters]
resolve_image_repos = true
resolve_s3 = true
capabilities = [
"CAPABILITY_NAMED_IAM"
]
tags = [
"Project=''",
"Email=''",
"Owner='Aashish Gaba'",
"Service=''"
]
AWSTemplateFormatVersion: "2010-09-09"
Transform: AWS::Serverless-2016-10-31
Description: >
  bp-llm-template
  Sample SAM Template for bp-llm-template
Parameters:
  ProjectId:
    Type: String
    Description: Project Id
  PrivateKeyId:
    Type: String
    Description: Google Service Account Private Key ID
  PrivateKey:
    Type: String
    Description: Google Service Account Private Key
  ClientEmail:
    Type: String
    Description: Google Client Email
  ClientId:
    Type: String
    Description: Google Client Id
  ClientX509CertUrl:
    Type: String
    Description: Certificate URL
  S3Bucket:
    Type: String
    Description: S3 bucket that lambdas should have access to for read and write
  RawDocS3KeyPrefix:
    Type: String
    Description: Raw S3 docs will be stored with this as prefix
  TransformedDocS3KeyPrefix:
    Type: String
    Description: Raw S3 docs will be stored with this as prefix
  ToolkitProjectName:
    Type: String
    Description: Toolkit Project Name
  ToolkitProjectId:
    Type: String
    Description: Toolkit Project Name
  ToolkitProjectVersion:
    Type: String
    Description: Toolkit Project Name
Globals:
  Function:
    Environment:
      Variables:
        toolkit_project_name: !Ref ToolkitProjectName
        toolkit_project_id: !Ref ToolkitProjectId
        toolkit_project_version: !Ref ToolkitProjectVersion
Resources:
  GDriveToS3WithFormatConversionStepFunction:
    Type: AWS::Serverless::StateMachine
    Properties:
      Name: !Sub 'GDriveToS3FormatConvSF${S3Bucket}'
      DefinitionUri: ../runtime/statemachine/ingestion/state_machine.asl.json
      DefinitionSubstitutions:
        CrawlGoogleDriveFolderFunctionArn: !GetAtt CrawlGoogleDriveFolder.Arn
        DownloadGoogleDriveFunctionArn: !GetAtt DownloadGoogleDriveFunction.Arn
        ConvertToASCIIFunctionArn: !GetAtt ConvertToASCIIFunction.Arn
      Policies:
        - LambdaInvokePolicy:
            FunctionName: !Ref CrawlGoogleDriveFolder
        - LambdaInvokePolicy:
            FunctionName: !Ref DownloadGoogleDriveFunction
        - LambdaInvokePolicy:
            FunctionName: !Ref ConvertToASCIIFunction
        - StepFunctionsExecutionPolicy:
            StateMachineName: !Sub 'GDriveToS3FormatConvSF${S3Bucket}'
  StorageS3Bucket:
    Type: 'AWS::S3::Bucket'
    DeletionPolicy: Retain
    Properties:
      BucketName: !Ref S3Bucket
  LambdaRoleReadWriteToBucket:
    Type: 'AWS::IAM::Role'
    Properties:
      RoleName: !Sub 'LLMToolkitSFLambdaRole${S3Bucket}'
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service: lambda.amazonaws.com
            Action: sts:AssumeRole
      Path: /
      ManagedPolicyArns:
        - 'arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole'
      Policies:
        - PolicyName: S3ReadWriteAccess
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: Allow
                Action:
                  - 's3:*Object'
                Resource: !Sub 'arn:aws:s3:::${S3Bucket}/*'
              - Effect: Allow
                Action:
                  - 's3:ListBucket'
                Resource: !Sub 'arn:aws:s3:::${S3Bucket}'
  CrawlGoogleDriveFolder:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: ../runtime/functions/crawl_gdrive_folder/
      Architectures:
        - arm64
      Handler: index.lambda_handler
      Runtime: python3.8
      Timeout: 180
      Role: !GetAtt LambdaRoleReadWriteToBucket.Arn
      Environment:
        Variables:
          project_id: !Ref ProjectId
          private_key_id: !Ref PrivateKeyId
          private_key: !Ref PrivateKey
          client_email: !Ref ClientEmail
          client_id: !Ref ClientId
          client_x509_cert_url: !Ref ClientX509CertUrl
      Layers:
        - !Ref UtilsLayer
  DownloadGoogleDriveFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: ../runtime/functions/download_doc_from_gdrive/
      Architectures:
        - arm64
      Handler: index.lambda_handler
      Runtime: python3.8
      Timeout: 180
      Role: !GetAtt LambdaRoleReadWriteToBucket.Arn
      Environment:
        Variables:
          project_id: !Ref ProjectId
          private_key_id: !Ref PrivateKeyId
          private_key: !Ref PrivateKey
          client_email: !Ref ClientEmail
          client_id: !Ref ClientId
          client_x509_cert_url: !Ref ClientX509CertUrl
          s3_bucket: !Ref S3Bucket
          raw_doc_s3_key_prefix: !Ref RawDocS3KeyPrefix
          transformed_doc_s3_key_prefix: !Ref TransformedDocS3KeyPrefix
      Layers:
        - !Ref UtilsLayer
  ConvertToASCIIFunction:
    Type: AWS::Serverless::Function
    Properties:
      Architectures:
        - arm64
      PackageType: Image
      Timeout: 180
      Role: !GetAtt LambdaRoleReadWriteToBucket.Arn
      Environment:
        Variables:
          s3_bucket: !Ref S3Bucket
          raw_doc_s3_key_prefix: !Ref RawDocS3KeyPrefix
          transformed_doc_s3_key_prefix: !Ref TransformedDocS3KeyPrefix
    Metadata:
      Dockerfile: convert.Dockerfile
      DockerContext: ../runtime
  UtilsLayer:
    Type: 'AWS::Serverless::LayerVersion'
    Properties:
      CompatibleRuntimes:
        - python3.8
      ContentUri: ../runtime/utils-layer/
      Description: Utility layer for shared code
Outputs:
  GDriveToS3WithFormatConversionStepFunctionArn:
    Description: "Stock Trading State machine ARN"
    Value: !Ref GDriveToS3WithFormatConversionStepFunction
  GDriveToS3WithFormatConversionStepFunctionRoleArn:
    Description: "IAM Role created for Stock Trading State machine based on the specified SAM Policy Templates"
    Value: !GetAtt GDriveToS3WithFormatConversionStepFunctionRole.Arn
I was not able to reproduce this problem, even with a similar template and the same samconfig.toml you sent (I added dummy values for the missing parameters).
I do not think the issue is related to configuration loading, as the two log statements are not logged in two different places.
We still need to investigate further to find the source of this problem.
Thanks for trying to reproduce it. This is weird: I've never seen a deployment run this fast with samconfig.toml, and my colleagues have in fact experienced even slower deployments.
In my case, the 3-minute gap falls between these two log lines:
Using SAM template at <path>
Using config file: ./samconfig.toml
Is something critical happening between these 2 log lines?
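One way to pin down where the time goes is to timestamp every line the command prints. The following is a minimal Python sketch under stated assumptions: it wraps any command (for example sam deploy --debug), merges stderr into stdout because debug logs usually go to stderr, and records the delay before each line. The function name time_output_lines and the 5-second threshold are hypothetical, and the gaps are approximate because the child process may buffer its output.

```python
import subprocess
import sys
import time


def time_output_lines(command):
    """Run command and return (seconds since previous line, line) pairs."""
    timings = []
    process = subprocess.Popen(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # capture log output written to stderr too
        text=True,
        bufsize=1,
    )
    previous = time.monotonic()
    for line in process.stdout:
        now = time.monotonic()
        timings.append((now - previous, line.rstrip("\n")))
        previous = now
    process.wait()
    return timings


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python time_lines.py sam deploy --debug
    for gap, line in time_output_lines(sys.argv[1:]):
        marker = "  <-- slow" if gap > 5 else ""
        print(f"{gap:8.2f}s  {line}{marker}")
```

Running it once with the overrides in samconfig.toml and once with them on the command line should show whether the long pause sits before the same log line in both cases.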
Using config file: github link
Using SAM template at: github link
Description:
Steps to reproduce:
parameter_overrides section in the samconfig.toml
--parameter-overrides
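To compare the two variants in the steps above with identical values, the overrides list from samconfig.toml can be turned into command-line arguments. This is a rough Python sketch, not SAM CLI code: the helper build_deploy_command is hypothetical, and it assumes each Key=Value entry can be passed unchanged after --parameter-overrides, which should be checked against the sam deploy documentation for your version.

```python
import shlex


def build_deploy_command(overrides, base=("sam", "deploy")):
    """Return an argv list that passes overrides via --parameter-overrides."""
    command = list(base)
    if overrides:
        command.append("--parameter-overrides")
        command.extend(overrides)
    return command


# Sample entries taken from the samconfig.toml posted in this thread.
overrides = ["S3Bucket='abc'", "ToolkitProjectVersion='0.0.1'"]
command = build_deploy_command(overrides)

# shlex.join shows a shell-safe form of the command for copy and paste.
print(shlex.join(command))
```

The returned list can be handed to subprocess.run directly, which avoids a second round of shell quoting for values such as private keys.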
Observed result:
1. With samconfig.toml
2. With Command Arguments
Expected result:
Additional environment details (Ex: Windows, Mac, Amazon Linux etc)