NFDI4BIOIMAGE / training

https://nfdi4bioimage.github.io/training
Creative Commons Attribution 4.0 International

Highlights of Download stats on main page. #285

Closed haesleinhuepf closed 2 weeks ago

haesleinhuepf commented 1 month ago

It would be great if we had a script that determines which Zenodo records were downloaded most often recently, similar to this notebook, and then updates the website's main page. For example, it could replace a placeholder such as {most_downloaded} with the actual content.
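The placeholder idea could be sketched roughly as follows. This is only a minimal illustration, assuming the download counts have already been collected into plain dictionaries; the keys `title`, `url` and `downloads` and the function names `render_highlights` and `fill_placeholder` are hypothetical, not part of any existing script in this repository.

```python
def render_highlights(records, top_n=3):
    """Return a Markdown bullet list of the top_n records by download count."""
    # Assumption: each record is a dict with 'title', 'url' and 'downloads'.
    ranked = sorted(records, key=lambda r: r["downloads"], reverse=True)[:top_n]
    return "\n".join(
        f"* [{r['title']}]({r['url']}) ({r['downloads']} downloads)" for r in ranked
    )


def fill_placeholder(page_text, records, placeholder="{most_downloaded}", top_n=3):
    """Replace every occurrence of the placeholder with the rendered highlights."""
    return page_text.replace(placeholder, render_highlights(records, top_n))


if __name__ == "__main__":
    # Made-up example data, for illustration only
    example_records = [
        {"title": "Example slides", "url": "https://example.org/a", "downloads": 12},
        {"title": "Example poster", "url": "https://example.org/b", "downloads": 40},
    ]
    print(fill_placeholder("## Highlights\n\n{most_downloaded}\n", example_records))
```

Keeping ranking and text replacement separate from the download step would make it possible to test the page update without calling the Zenodo API.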

Optional: It could download the most downloaded item, turn it into a PNG and show it on the main page. This will obviously not work for every kind of content, but it should work for PPTX or PDF files.

git-bob comment: Given a Zenodo record URL, how would you automatically download the first file from this Zenodo record using the Zenodo API and, if it's a PPT or PDF, turn the first slide or page into a PNG?

github-actions[bot] commented 1 month ago

This message was generated by git-bob (version: 0.9.0, model: github_models:gpt-4o, log), an experimental AI-based assistant. It can make mistakes and has limitations. Check its messages carefully.

To download the first file from a Zenodo record and convert it to a PNG if it's a PPT or PDF:

import subprocess
import tempfile
from pathlib import Path

import requests
from pdf2image import convert_from_bytes, convert_from_path  # requires poppler


def download_first_file_from_zenodo(record_id, output_path='first_page.png'):
    # Fetch record metadata
    url = f"https://zenodo.org/api/records/{record_id}"
    response = requests.get(url)
    response.raise_for_status()
    data = response.json()

    # Get the first file's name and download link.
    # Assumption: each file entry exposes 'key' (the file name) and 'links.self'.
    first_file = data['files'][0]
    file_name = first_file['key']
    file_url = first_file['links']['self']
    suffix = Path(file_name).suffix.lower()

    # Download the file
    response = requests.get(file_url)
    response.raise_for_status()

    # Decide by file extension whether the file is a PDF or a presentation
    if suffix == '.pdf':
        # Render only the first page
        pages = convert_from_bytes(response.content, first_page=1, last_page=1)

    elif suffix in ('.ppt', '.pptx'):
        # python-pptx cannot render slides as images, so convert to PDF first.
        # Assumption: LibreOffice ("soffice") is installed and on the PATH.
        with tempfile.TemporaryDirectory() as tmp:
            source = Path(tmp) / f"slides{suffix}"
            source.write_bytes(response.content)
            subprocess.run(
                ['soffice', '--headless', '--convert-to', 'pdf',
                 '--outdir', tmp, str(source)],
                check=True,
            )
            pages = convert_from_path(
                str(source.with_suffix('.pdf')), first_page=1, last_page=1
            )

    else:
        raise ValueError(f"Unsupported file type: {file_name}")

    # Save the first page or slide as a PNG
    pages[0].save(output_path, 'PNG')

# Example usage
download_first_file_from_zenodo('1234567')

Replace '1234567' with the actual Zenodo record ID.
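Since the question started from a Zenodo record URL rather than a bare ID, a small helper could extract the ID first. The following is a sketch only: the function name `record_id_from_url` is hypothetical, and the URL shapes it accepts (`zenodo.org/record/<id>`, `zenodo.org/records/<id>`, `zenodo.org/api/records/<id>` and DOIs ending in `zenodo.<id>`) are assumptions about what such links typically look like.

```python
import re

# Assumed URL shapes: .../record/<id>, .../records/<id>, .../api/records/<id>,
# or a DOI such as 10.5281/zenodo.<id>
_RECORD_ID_PATTERN = re.compile(
    r"zenodo\.org/(?:api/)?records?/(\d+)|zenodo\.(\d+)"
)


def record_id_from_url(url):
    """Return the numeric Zenodo record ID contained in a record URL or DOI."""
    match = _RECORD_ID_PATTERN.search(url)
    if match is None:
        raise ValueError(f"No Zenodo record ID found in: {url}")
    # Exactly one of the two groups is set, depending on which form matched
    return match.group(1) or match.group(2)
```

The returned string can then be passed to a function that expects a record ID, such as the one above.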

haesleinhuepf commented 1 month ago

Hi @marabuuu,

as discussed yesterday, could you please check if this works or can be made to work? (-:

Edit: Feel free to let git-bob do the hard part of the job :-)

Thanks!

Best, Robert

marabuuu commented 1 month ago

Yes, I am happy to look into this :)