RNAcentral / rnacentral-webcode

RNAcentral website source code
https://rnacentral.org
Apache License 2.0

Interrupted download #620

Open Altair6088 opened 1 year ago

Altair6088 commented 1 year ago

Hello, I'm currently trying to retrieve all human lncRNA sequences via text search through here: https://rnacentral.org/search?q=RNA%20AND%20so_rna_type_name:%22NcRNA%22%20AND%20TAXONOMY:%229606%22%20AND%20so_rna_type_name:%22LncRNA%22

The site retrieved them all right, but when I try to download the results as a FASTA file, the download keeps getting interrupted, even though I downloaded the mouse sequences just fine. I don't know if it's a file size issue. Help please.

asd1864714 commented 8 months ago

I also have the same problem. The number of downloaded entries is inconsistent with the number retrieved by the website. The downloaded JSON or RNAcentral IDs files contain fewer entries than the number the search reports.

carlosribas commented 8 months ago

Hi @asd1864714,

I think we are talking about two different problems. This ticket was created to report a problem downloading the file. This problem can unfortunately still occur, especially for files with a lot of entries.

The other problem is a mismatch between the number of entries in the downloaded file and the number of entries reported by Text Search. As far as I can see, our Text Search has some duplicate IDs (the same sequence listed twice).

The file has the correct number of unique sequences and we will work to fix the results in Text Search.

Kind regards,

Carlos
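
For anyone who wants to check the count mismatch locally, here is a minimal Python sketch (not part of RNAcentral's tooling) that counts the total and unique record IDs in a downloaded FASTA file. It assumes the ID is the first whitespace-separated token on each `>` header line; the function name `count_fasta_ids`, the sample IDs, and the file name `results.fasta` are hypothetical.

```python
from typing import Iterable, Tuple


def count_fasta_ids(lines: Iterable[str]) -> Tuple[int, int]:
    """Return (total_records, unique_ids) for FASTA-formatted lines.

    Assumption: the record ID is the first whitespace-separated token
    after '>' on each header line.
    """
    total = 0
    seen = set()
    for line in lines:
        if line.startswith(">"):
            total += 1
            tokens = line[1:].split()
            # An empty header is counted under the empty-string ID.
            seen.add(tokens[0] if tokens else "")
    return total, len(seen)


# Made-up sample records; the second and third share an ID.
sample = [
    ">ID_A_9606 example lncRNA\n",
    "ACGU\n",
    ">ID_B_9606 example lncRNA\n",
    "GGCA\n",
    ">ID_B_9606 example lncRNA\n",
    "GGCA\n",
]
total, unique = count_fasta_ids(sample)
print(f"{total} records, {unique} unique IDs")
```

To run it on a real download, pass an open file handle instead of the sample list, for example `count_fasta_ids(open("results.fasta"))`. If the unique count matches the number of records in the file but is lower than the number shown in Text Search, that is consistent with the duplicate IDs described above.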