box / box-python-sdk-gen

Repository for generated Box Python SDK
Apache License 2.0

SDK to SDK-Gen Migration Help #341

Open rleary90 opened 1 week ago

rleary90 commented 1 week ago

Description of the Issue

What is the box-python-sdk-gen equivalent of boxsdk.object.file.File?

Steps to Reproduce

Here is the code I'm attempting to migrate from box-python-sdk to box-python-sdk-gen:

# Get the folder
folder = client.folder(folder_id).get()

# Initialize an empty list to store file items
file_items = []

# Iterate over items in the folder
for item in folder.get_items():
    # Check if the item is a file
    if isinstance(item, boxsdk.object.file.File):
        # Get the file name and extension
        file_name, file_extension = os.path.splitext(item.name)
        # Check if the file extension is .xlsx, .xlsm, .xls, or .csv
        if file_extension.lower() in ['.xlsx', '.xlsm', '.xls', '.csv']:
            # Add the file item to the list
            file_items.append(item)

# Get metadata for files in the specified folder
folder_items = client.folder(folder_id).get_items()

# Initialize list to store combined data
combined_data = []

# Iterate over items in the folder
for file_item in folder_items:
    # Check if the item is a file and has one of the specified extensions
    if isinstance(file_item, boxsdk.object.file.File):
        file_extension = file_item.name.split('.')[-1].lower()
        if file_extension in ['xlsx', 'xlsm', 'xls', 'csv']:
            # Download the file content
            try:
                file_content = client.file(file_item.id).content()
            except Exception as e:
                print(f"Error downloading file: {file_item.name}")
                print(e)
                continue

            # Read General Information tabs from all files in target directory as files_df dataframe
            try:
                df = pd.read_excel(file_content, sheet_name="01 - Chapter Overview", skiprows=1)
                df = df.drop('Unnamed: 0', axis=1)
                df = df.iloc[:, :2]
                df = df.transpose()
                df = df.reset_index()
                df.columns = df.iloc[0]
                df = df.drop(0)
                # Drop columns by column number
                columns_to_drop = [0, 1, 2, 3, 4, 8, 9, 23, 24, 31, 32, 33, 34]
                df = df.drop(df.columns[columns_to_drop], axis=1)
                df = df.reset_index(drop=True)

                # Add metadata as additional columns in DataFrame
                df['File Name'] = file_item.name
                df['File ID'] = file_item.id

                # Get file metadata
                file_info = client.file(file_item.id).get()
                df['Created At'] = file_info.created_at
                df['Modified At'] = file_info.modified_at
                df['File Creator'] = file_info.created_by.name

                # Getting shared link
                shared_link = client.file(file_item.id).get().shared_link
                url = shared_link['url']
                df['File URL'] = url

                # Append DataFrame to combined data list
                combined_data.append(df)
            except Exception as e:
                print(f"Error processing file: {file_item.name}")
                print(e)

if combined_data:
    # Concatenate DataFrames
    chapter_overview_df = pd.concat(combined_data, ignore_index=True)
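
The file check in the code above is the one step tied to the boxsdk.object.file.File class. One possible way to make it independent of the SDK's class hierarchy is to test the item's `type` and `name` attributes instead of using isinstance. The sketch below is only an illustration under that assumption: it presumes folder entries expose `type` (a plain string or an enum whose value is "file") and `name`, which should be confirmed against the box-python-sdk-gen docs. The helpers `is_spreadsheet_file` and `select_spreadsheets` and the stand-in entries are hypothetical names, not part of either SDK.

```python
import os
from types import SimpleNamespace

# Extensions taken from the original snippet
SPREADSHEET_EXTENSIONS = {".xlsx", ".xlsm", ".xls", ".csv"}


def is_spreadsheet_file(item, extensions=SPREADSHEET_EXTENSIONS):
    """Return True if `item` looks like a file with a spreadsheet extension.

    Assumption: `item` exposes `type` and `name` attributes. This is a
    duck-typed check, not an official replacement for isinstance().
    """
    item_type = getattr(item, "type", None)
    # Enum-valued types carry the raw string in `.value`; plain strings are used as-is
    item_type = getattr(item_type, "value", item_type)
    if item_type != "file":
        return False
    _, extension = os.path.splitext(getattr(item, "name", "") or "")
    return extension.lower() in extensions


def select_spreadsheets(items):
    """Keep only the entries that pass is_spreadsheet_file()."""
    return [item for item in items if is_spreadsheet_file(item)]


# Stand-in objects for illustration only; real entries would come from the SDK
entries = [
    SimpleNamespace(type="file", id="1", name="Report.XLSX"),
    SimpleNamespace(type="folder", id="2", name="archive.csv"),
    SimpleNamespace(type="file", id="3", name="notes.txt"),
    SimpleNamespace(type="file", id="4", name="data.csv"),
]
selected_ids = [item.id for item in select_spreadsheets(entries)]
```

With a helper like this, both loops above could share one filter rather than using os.path.splitext in one place and split('.') in the other.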

Expected Behavior

A combined DataFrame built from the same tab of every spreadsheet file in which that tab exists.

Error Message, Including Stack Trace

Screenshots

Versions Used

Python SDK: python-sdk-gen
Python: python 3.8

mwwoda commented 5 days ago

In general, you could try our docs, which show usage examples for most of the functions in the SDK. If you are interested in more complex scenarios, you could also take a look at our integration tests.

To give examples for a few of the function calls you provided: