galaxyproject / training-material

A collection of Galaxy-related training material
https://training.galaxyproject.org
MIT License
297 stars, 869 forks

Human Cell Atlas Matrix Downloader database problem #4567

Open wee-snufkin opened 7 months ago

wee-snufkin commented 7 months ago

When I try to use the HCA Downloader, it fails with the error "the project identifier XXX was not found in the database". I tried the project title, the project label and the project UUID in turn, and all of them failed. This happened for three different projects. The only project that worked was the one given as an example, "Single cell transcriptome analysis of human pancreas".

Another problem came up when I tested the "Species to use" box, again with the "Single cell transcriptome analysis of human pancreas" project (the only one that seemed to work). I typed "Homo sapiens", as the HCA Atlas indicates, and this time the tool failed with the following error:

usage: Download data via HCA DCP FTP. Requires -p input. Files are downloaded into current working directory. [-h] -p PROJECT [-f {loom,mtx}] [-o OUTPREFIX] [-s SPECIES]
Download data via HCA DCP FTP. Requires -p input. Files are downloaded into current working directory.: error: unrecognized arguments: sapiens

Here is the history with all those datasets: https://usegalaxy.eu/u/j.jakiela/h/hca-data

nomadscientist commented 6 months ago

@pcm32 who should we direct this to?

pcm32 commented 6 months ago

There could be some backend change from the Human Cell Atlas team at the EBI. I would approach them so that they can test the version of the CLI used by this wrapper. You could also make a PR to the wrapper in the containers Galaxy SC repo to make sure that the text passed for species is wrapped in quotes.
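To illustrate the quoting suggestion, here is a minimal sketch of why an unquoted "Homo sapiens" would produce "unrecognized arguments: sapiens". It assumes the CLI parses its arguments with Python's argparse, which the format of the usage message suggests but which is not confirmed here. The parser below is a hypothetical stand-in that only mirrors the flags shown in the usage message; it is not the real downloader code, and the names `build_parser`, `parse_command_line` and the `tool` placeholder are made up for illustration.

```python
import argparse
import shlex


def build_parser():
    # Hypothetical stand-in: mirrors only the flags visible in the usage
    # message from the report, not the real HCA downloader implementation.
    parser = argparse.ArgumentParser(prog="hca-downloader-sketch")
    parser.add_argument("-p", dest="project", required=True)
    parser.add_argument("-f", dest="format", choices=["loom", "mtx"])
    parser.add_argument("-o", dest="outprefix")
    parser.add_argument("-s", dest="species")
    return parser


def parse_command_line(command):
    """Split a command string the way a POSIX shell would, then parse it.

    Returns the parsed namespace, or None if argparse rejects the arguments
    (argparse prints its error to stderr and raises SystemExit).
    """
    argv = shlex.split(command)[1:]  # drop the program name
    try:
        return build_parser().parse_args(argv)
    except SystemExit:
        return None


project = "Single cell transcriptome analysis of human pancreas"
species = "Homo sapiens"

# Species interpolated without quotes: the shell splits it into two words,
# so "sapiens" arrives as a stray extra argument.
unquoted = f"tool -p {shlex.quote(project)} -s {species}"

# Species wrapped in quotes: it reaches the parser as a single value.
quoted = f"tool -p {shlex.quote(project)} -s {shlex.quote(species)}"

print(parse_command_line(unquoted))        # rejected by argparse
print(parse_command_line(quoted).species)  # parsed as one value
```

In a Galaxy tool wrapper, the equivalent fix would presumably be to quote the species variable in the command template, so that the whole string is passed as the value of `-s`.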

tburdett commented 6 months ago

Hi @nomadscientist, thanks for calling this to my attention. I'm not sure whether there has been a backend change that would have affected the Galaxy downloader, but I can do some digging. Can someone who knows the HCA downloader - maybe @pcm32 (hi Pablo!) - point me in the right direction? I'm looking for the mechanism that Galaxy uses to grab data from the HCA - even a pointer to the relevant bit of code would help me!