KaniyamFoundation / ProjectIdeas

A Place to write down the project ideas and to plan them

Archive.org Bulk Upload #225

Open IngersolNorway opened 2 months ago

IngersolNorway commented 2 months ago

The current command (ia upload --spreadsheet upload.csv), which uploads multiple files using a metadata CSV that links each stored file name to its metadata, needs the following changes:

  1. Reduce Uploading Speed: Decrease the upload rate to avoid triggering spam-suspicion blocks from the archive.
  2. File Management:
    • Move uploaded files to a separate folder.
    • Move error files to a separate folder.
  3. Continuous Uploading: Ensure the uploading process does not stop if an error occurs. It should continue with the next file.
  4. Summary List: Provide a summary list that includes the file name, Excel name, and upload status (done/failed). This will make it easier to remove those lines from the Excel file for re-upload.
  5. File Naming: Add a ".pdf" extension to the end of file names and replace spaces with underscores automatically.
  6. Column Efficiency: Reduce the number of columns by avoiding duplication of the same data across multiple columns.
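As a rough illustration of items 1 to 4, the loop could look something like the sketch below. It is only a sketch under assumptions: the column names "file" and "identifier", the folder arguments, the 30-second default delay, and the `upload_one` callable (which would wrap whatever actually performs the upload) are all placeholders, not part of the existing command.

```python
import csv
import shutil
import time
from pathlib import Path
from typing import Callable, Dict, List

SUMMARY_FIELDS = ["file", "identifier", "status", "error"]


def bulk_upload(
    rows: List[Dict[str, str]],
    upload_one: Callable[[Dict[str, str]], None],  # hypothetical uploader hook
    done_dir: Path,
    failed_dir: Path,
    delay_seconds: float = 30.0,  # assumed pause; tune to what the archive tolerates
) -> List[Dict[str, str]]:
    """Upload rows one at a time and keep going when a row fails."""
    done_dir.mkdir(parents=True, exist_ok=True)
    failed_dir.mkdir(parents=True, exist_ok=True)
    summary: List[Dict[str, str]] = []
    for index, row in enumerate(rows):
        path = Path(row["file"])
        try:
            upload_one(row)
            status, error, target = "done", "", done_dir
        except Exception as exc:  # record the failure and continue with the next file
            status, error, target = "failed", str(exc), failed_dir
        if path.exists():
            # Sort the local file into the "uploaded" or "error" folder.
            shutil.move(str(path), str(target / path.name))
        summary.append(
            {
                "file": path.name,
                "identifier": row.get("identifier", ""),
                "status": status,
                "error": error,
            }
        )
        if index < len(rows) - 1:
            time.sleep(delay_seconds)  # throttle between uploads
    return summary


def write_summary(summary: List[Dict[str, str]], out_path: Path) -> None:
    """Write one summary line per file so failed rows are easy to find."""
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=SUMMARY_FIELDS)
        writer.writeheader()
        writer.writerows(summary)
```

The rows would come from reading upload.csv with csv.DictReader, and `upload_one` could, for example, call the `upload` function of the internetarchive Python package. Rows marked "failed" in the summary are the ones to keep in the sheet for re-upload.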

Can you help with these changes?

Regards,
Ingersol

tshrinivasan commented 2 months ago

https://github.com/KaniyamFoundation/internet-archive-bulk-upload

@IngersolNorway here is the code for bulk upload to the Internet Archive.

Please test it and share your thoughts.

IngersolNorway commented 2 months ago

Does it have all of the features listed above, such as:

  • Move uploaded files to a separate folder.
  • Move error files to a separate folder.
  • Ensure the uploading process does not stop if an error occurs. It should continue with the next file.
  • Provide a summary list that includes the file name, Excel name, and upload status (done/failed).

tshrinivasan commented 1 month ago

@IngersolNorway

Except for adding a ".pdf" extension to the end of file names, the code has all the other features mentioned.
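
The file-naming step could be handled before upload with a small helper along these lines. This is a minimal sketch, not code from the linked repository; `normalize_pdf_name` is a hypothetical name, and it assumes every file in the sheet is a PDF.

```python
def normalize_pdf_name(name: str) -> str:
    """Replace spaces with underscores and make sure the name ends in .pdf."""
    cleaned = name.strip().replace(" ", "_")
    # Compare case-insensitively so "book.PDF" does not become "book.PDF.pdf".
    if not cleaned.lower().endswith(".pdf"):
        cleaned += ".pdf"
    return cleaned
```

For example, normalize_pdf_name("my first book") gives "my_first_book.pdf", while a name that already ends in ".pdf" only has its spaces replaced.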