agrc / palletjack

A library for updating AGOL data from various external sources

Return full failsafe path programmatically as part of truncate and load #43

Open jacobdadams opened 1 year ago

jacobdadams commented 1 year ago

Cloud Functions can't mount Cloud Storage as a file system, so we'd have to pass a temp directory as the failsafe_dir, get the full path of the saved file programmatically, and copy that file over to a bucket.
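If the failsafe path were returned, a Cloud Function could write the failsafe into a temp directory and copy that file into a bucket before the directory is cleaned up. A minimal sketch of that flow, assuming the `google-cloud-storage` client; `copy_failsafe_to_bucket`, the bucket name, and the stand-in file are illustrative, not palletjack API:

```python
import tempfile
from pathlib import Path

from google.cloud import storage


def copy_failsafe_to_bucket(failsafe_path, bucket_name, prefix="failsafes"):
    """Upload a locally-written failsafe file to a Cloud Storage bucket and return the blob name."""
    client = storage.Client()
    blob = client.bucket(bucket_name).blob(f"{prefix}/{Path(failsafe_path).name}")
    blob.upload_from_filename(str(failsafe_path))
    return blob.name


with tempfile.TemporaryDirectory() as temp_dir:
    # The truncate-and-load call would go here with failsafe_dir=temp_dir; this issue
    # proposes that it return the full path of the file it writes. Stand-in file below:
    failsafe_path = Path(temp_dir) / "example_layer_2024-01-01.json"
    failsafe_path.write_text("{}")
    copy_failsafe_to_bucket(failsafe_path, "my-failsafe-bucket")
```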

steveoh commented 1 year ago
```python
def save_feature_layer_to_json(feature_layer, directory):
    """Save a feature_layer to directory for safety as {layer name}_{today's date}.json

    Args:
        feature_layer (arcgis.features.FeatureLayer): The FeatureLayer object to save to disk.
        directory (str or Path): The directory to save the data to.

    Returns:
        Path: The full path to the output file, named with the layer name and today's date.
    """
```

The Cloud Storage bucket or blob name could be passed here instead, and the dataframe JSON could be written directly to a storage bucket if the directory argument follows a naming convention. It's redundant to write data to disk only to read it back and write the same data to cloud storage.
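A minimal sketch of that idea, assuming the `google-cloud-storage` client and the arcgis `FeatureSet.to_json` property; `save_feature_layer_to_bucket` and its naming convention are illustrative, not an existing palletjack function:

```python
from datetime import date

from google.cloud import storage


def save_feature_layer_to_bucket(feature_layer, bucket_name, prefix=""):
    """Write a feature layer's JSON straight to a Cloud Storage blob, skipping local disk."""
    blob_name = f"{prefix}{feature_layer.properties.name}_{date.today()}.json"

    # query() returns a FeatureSet; to_json is its JSON representation as a string
    feature_set_json = feature_layer.query().to_json

    blob = storage.Client().bucket(bucket_name).blob(blob_name)
    blob.upload_from_string(feature_set_json, content_type="application/json")

    return blob_name
```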

Here's an example of a pandas dataframe being written to the parquet format in memory and uploaded to Cloud Storage:

https://github.com/agrc/udot-parcel-ml/blob/af3c04e142af74520e2e481c1b8f31769fc3cc41/row.py#L749-L752
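For reference, the pattern looks roughly like this (a sketch assuming pandas with pyarrow and `google-cloud-storage`, not the linked code verbatim):

```python
from io import BytesIO

import pandas as pd
from google.cloud import storage


def upload_dataframe_as_parquet(dataframe: pd.DataFrame, bucket_name: str, blob_name: str) -> None:
    """Serialize a DataFrame to parquet in memory and upload it to Cloud Storage without touching disk."""
    buffer = BytesIO()
    dataframe.to_parquet(buffer)  # requires a parquet engine such as pyarrow

    blob = storage.Client().bucket(bucket_name).blob(blob_name)
    blob.upload_from_file(buffer, rewind=True)  # rewind seeks back to the start before reading
```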