This repository provides a 'converter' Lambda function. It converts CSV data from Autodesk ACC (or BIM360) into Parquet format and stores it in your own S3 bucket, so that you can efficiently generate reports in AWS QuickSight from the Parquet data.
Two other helper functions kick off the ACC Data Connector process, wait for the CSV files to become ready, and then trigger the 'converter' Lambda function.
Users of Autodesk Construction Cloud (ACC) or BIM360 who want to leverage the reporting power of AWS QuickSight need their data in a format suitable for QuickSight consumption. Parquet files provide an efficient, compressed format ideal for this purpose. This Lambda function automates the process of converting CSV files to Parquet, ensuring that your data is ready for QuickSight.
This project provides an AWS Lambda function that:
Use the 'create-a-weekly-schedule.py' script to set up a new scheduled job, and configure 'handler-callback.py' to listen for the 'CSVs are ready' callback event.
See the reference documentation for how 'handler-callback.py' retrieves the individual signed URL for each CSV file from the BIM360 Data Connector API.
Serverless computing is a cloud computing execution model in which the cloud provider runs the server, dynamically managing the allocation of machine resources. In this model, you can build and run applications without managing infrastructure, scaling automatically based on demand. AWS Lambda is one such service that allows you to execute code without provisioning or managing servers.
Here is a high-level overview of how the solution works:
Data Source: ACC or BIM360 generates CSV files.
Signed URL: The CSV file is accessed via a signed URL.
Event: A webhook event sent by ACC/BIM360 when a CSV file changes. The event contains a signed URL (for the 'source' CSV) and a filename (for the 'destination' Parquet file).
AWS Lambda: The Lambda function processes the CSV and converts it to Parquet format.
S3 Bucket: The resulting Parquet file is stored in your own S3 bucket.
AWS QuickSight: QuickSight generates reports from the Parquet files.
(see step 2 below)
(see step 3 below)
You need to create a Lambda Layer that includes DuckDB as a dependency.
A. Create a directory for the dependencies, install DuckDB into it, and package the layer:
mkdir duckdb_layer
cd duckdb_layer
mkdir -p python
pip install --target ./python duckdb
zip -r9 duckdb_layer.zip python
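Lambda looks for Python layer packages under a top-level python/ directory inside the archive, so it can be worth checking the zip before uploading it. The following is a minimal sketch of such a check; the function name layer_has_package is hypothetical and not part of this repository.

```python
import zipfile


def layer_has_package(zip_path: str, package: str = "duckdb") -> bool:
    """Return True if the archive holds entries under python/<package>."""
    # Matches both a package directory (python/duckdb/...) and a single
    # compiled module file (python/duckdb.*.so).
    prefix = f"python/{package}"
    with zipfile.ZipFile(zip_path) as archive:
        return any(name.startswith(prefix) for name in archive.namelist())


if __name__ == "__main__":
    # Assumes the zip file created by the commands above is in the
    # current directory.
    print(layer_has_package("duckdb_layer.zip"))
```

If the check returns False, the most likely cause is that the zip was built from inside the python directory rather than from its parent.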
Go to the AWS Lambda console and configure the following environment variables:
C. Example Usage
aws lambda invoke \
--function-name your-lambda-function-name \
--payload '{"source_url": "https://signed-url-to-csv-file", "destination_filename": "output.parquet"}' \
response.json
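To show how a handler might consume the payload in the example above, here is a minimal sketch, not the repository's actual implementation. It assumes the CSV is first downloaded to /tmp, that DuckDB is available through the layer, and that the bucket name comes from an environment variable called DEST_BUCKET (a hypothetical name). The helper names parse_event, sql_quote and build_copy_sql are also illustrative.

```python
import os
import tempfile
import urllib.request


def parse_event(event: dict) -> tuple[str, str]:
    """Extract and validate the two fields used in the example payload."""
    source_url = event.get("source_url")
    destination = event.get("destination_filename")
    if not source_url or not destination:
        raise ValueError("event needs 'source_url' and 'destination_filename'")
    if not source_url.startswith("https://"):
        raise ValueError("'source_url' must be an https URL")
    return source_url, destination


def sql_quote(value: str) -> str:
    """Quote a string as a SQL literal by doubling single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_copy_sql(csv_path: str, parquet_path: str) -> str:
    """Build a DuckDB COPY statement that rewrites a CSV file as Parquet."""
    return (
        f"COPY (SELECT * FROM read_csv_auto({sql_quote(csv_path)})) "
        f"TO {sql_quote(parquet_path)} (FORMAT PARQUET)"
    )


def lambda_handler(event, context):
    # Imported lazily so the helpers above work without the layer installed.
    import boto3
    import duckdb

    source_url, destination = parse_event(event)
    bucket = os.environ["DEST_BUCKET"]  # hypothetical variable name

    # /tmp is the writable scratch space in the Lambda runtime.
    with tempfile.TemporaryDirectory(dir="/tmp") as workdir:
        csv_path = os.path.join(workdir, "input.csv")
        parquet_path = os.path.join(workdir, "output.parquet")
        urllib.request.urlretrieve(source_url, csv_path)
        duckdb.connect().execute(build_copy_sql(csv_path, parquet_path))
        boto3.client("s3").upload_file(parquet_path, bucket, destination)

    return {"statusCode": 200, "body": f"s3://{bucket}/{destination}"}
```

Downloading to local disk first keeps the sketch free of DuckDB extensions; very large CSV files may need more ephemeral storage than the Lambda default.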
create-a-weekly-schedule.py
PURPOSE:
Call this URL endpoint to schedule a one-off Data Connector API dump of CSV files.
Remember to configure the callback to point to 'handler-callback.py'.
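As a rough illustration of what such a scheduling call could look like, the sketch below builds (but does not send) a request to the Data Connector API. The endpoint path and the body field names (scheduleInterval, effectiveFrom, serviceGroups, callbackUrl and so on) are assumptions based on a reading of the Autodesk Data Connector API and should be checked against its reference before use. The function name build_schedule_request and all argument values are hypothetical.

```python
import json
import urllib.request

# Assumed base URL for the Data Connector API; verify against the reference.
API_BASE = "https://developer.api.autodesk.com/data-connector/v1"


def build_schedule_request(
    account_id: str,
    access_token: str,
    callback_url: str,
    effective_from: str,
    interval: str = "ONE_TIME",
) -> urllib.request.Request:
    """Prepare a POST that asks Data Connector for a CSV extract."""
    body = {
        # Field names below are assumptions, not confirmed by this README.
        "description": "CSV extract for Parquet conversion",
        "isActive": True,
        "scheduleInterval": interval,
        "effectiveFrom": effective_from,  # ISO 8601 timestamp
        "serviceGroups": ["all"],
        "callbackUrl": callback_url,  # should point at handler-callback.py
    }
    return urllib.request.Request(
        url=f"{API_BASE}/accounts/{account_id}/requests",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    request = build_schedule_request(
        account_id="your-account-id",
        access_token="your-access-token",
        callback_url="https://your-callback-endpoint",
        effective_from="2025-01-01T00:00:00.000Z",
    )
    # Inspect the request; send it with urllib.request.urlopen(request).
    print(request.get_method(), request.full_url)
```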
handler-callback.py
PURPOSE:
INPUTS:
See above
This setup helps you automatically convert Autodesk ACC (or BIM360) data into a format suitable for AWS QuickSight reporting. By leveraging serverless infrastructure with AWS Lambda, this solution is both scalable and cost-effective.