huggingface / datasets

🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
https://huggingface.co/docs/datasets
Apache License 2.0

Huggingface GIT returns null as Content-Type instead of application/x-git-receive-pack-result #7225

Open padmalcom opened 1 month ago

padmalcom commented 1 month ago

Describe the bug

We push changes to our datasets programmatically. Our Git client, JGit, reports that the Hugging Face Git server returns null as the Content-Type after a push.

Steps to reproduce the bug

A basic Kotlin application:

    val person = PersonIdent(
        "padmalcom",
        "padmalcom@sth.com"
    )

    val cp = UsernamePasswordCredentialsProvider(
        "padmalcom",
        "mysecrettoken"
    )

    val git =
        KGit.cloneRepository {
            setURI("https://huggingface.co/datasets/sth/images")
            setTimeout(60)
            setProgressMonitor(TextProgressMonitor())
            setCredentialsProvider(cp)
        }

    // writeCsv is our own helper (not shown); use {} closes the stream before the file is staged.
    FileOutputStream("./images/images.csv").use { it.writeCsv(images) }
    git.add {
        addFilepattern("images.csv")
    }

    for (i in images) {
        // Build "<id>.<extension>"; File.extension returns the suffix without the dot.
        val targetName = "${i.id}.${File(i.fileName).extension}"
        FileUtils.copyFile(
            File("./files/${i.id}"),
            File("./images/$targetName")
        )
        git.add {
            addFilepattern(targetName)
        }
    }

    val revCommit = git.commit {
        author = person
        message = "Uploading images at " + LocalDateTime.now()
            .format(DateTimeFormatter.ISO_DATE_TIME)
        setCredentialsProvider(cp)
    }

    val push = git.push {
        setCredentialsProvider(cp)
    }

Expected behavior

The git server is expected to return the Content-Type application/x-git-receive-pack-result.
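
For reference, here is a minimal sketch of the kind of check that fails on the client side. It assumes the media types described in Git's smart HTTP protocol documentation (`application/x-<service>-advertisement` for the ref advertisement and `application/x-<service>-result` for the POST response). The function names are hypothetical and are not JGit API, and the comparison is deliberately lenient about parameters and case, which a real client may not be.

```kotlin
// Sketch only: mirrors the Content-Type validation a smart-HTTP Git client performs.
// Function names are hypothetical; only the media types come from the Git HTTP protocol docs.

/** Media type expected on the response to POST <repo>/<service>. */
fun expectedResultType(service: String): String = "application/x-${service}-result"

/** Media type expected on the response to GET <repo>/info/refs?service=<service>. */
fun expectedAdvertisementType(service: String): String = "application/x-${service}-advertisement"

/**
 * Returns null when the header matches, otherwise a description of the mismatch.
 * `actual` is nullable because a missing header shows up as null in most HTTP clients.
 */
fun contentTypeProblem(expected: String, actual: String?): String? {
    // Drop parameters such as "; charset=utf-8" before comparing (an assumption of this sketch).
    val mediaType = actual?.substringBefore(';')?.trim()
    return if (mediaType.equals(expected, ignoreCase = true)) {
        null
    } else {
        "expected Content-Type $expected; received Content-Type $actual"
    }
}

fun main() {
    val expected = expectedResultType("git-receive-pack")
    println(contentTypeProblem(expected, expected)) // header present and correct
    println(contentTypeProblem(expected, null))     // header missing, as in this report
}
```

A missing header therefore surfaces as `null` on the client, which matches what we see after the push.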

Environment info

The issue is independent of the datasets library.