saketkc / pysradb

Package for fetching metadata and downloading data from SRA/ENA/GEO
https://saketkc.github.io/pysradb
BSD 3-Clause "New" or "Revised" License

[ENH] library_layout #54

Closed: Maarten-vd-Sande closed this issue 4 years ago

Maarten-vd-Sande commented 4 years ago

First off, thanks for the awesome tool!

Is your feature request related to a problem? Please describe. In the examples in the docs and notebooks, the returned dataframe has a library_layout column. However, I do not get this column with the newest version and SRAweb.


import pysradb
from pysradb.sraweb import SRAweb

print(pysradb.__version__)
print(SRAweb().sra_metadata("SRP016501", detailed=True).columns)

> 0.11.0
>Index(['study_accession', 'experiment_accession', 'experiment_title',
       'experiment_desc', 'organism_taxid ', 'organism_name',
       'library_strategy', 'library_source', 'library_selection',
       'sample_accession', 'sample_title', 'instrument', 'total_spots',
       'total_size', 'run_accession', 'run_total_spots', 'run_total_bases',
       'run_alias', 'sra_url_alt1', 'sra_url_alt2', 'sra_url',
       'experiment_alias', 'source_name', 'tissue', 'sra_url_alt3', 'strain',
       'ena_fastq_http_1', 'ena_fastq_http_2', 'ena_fastq_ftp_1',
       'ena_fastq_ftp_2'],
      dtype='object')

I scanned the source very briefly but it seems like this info is never used: https://github.com/saketkc/pysradb/blob/master/pysradb/sraweb.py#L416

Is there a reason it's not there?

saketkc commented 4 years ago

Thanks @Maarten-vd-Sande for the bug report. It was indeed not being exported. The latest commit should fix it. You can try installing the master branch:

pip install git+https://github.com/saketkc/pysradb.git
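
After installing from master, a quick check along these lines might help confirm that the column is now exported. This is only a sketch: `missing_columns` is a hypothetical helper written for this example, not part of pysradb, and the `reported` list is an abridged copy of the 0.11.0 column output quoted earlier in this thread. The live call at the bottom is left commented out because it needs network access.

```python
# Hypothetical helper (not part of pysradb): report which expected
# metadata columns are absent from a collection of column names.
def missing_columns(columns, expected=("library_layout",)):
    # Strip whitespace so a name such as 'organism_taxid ' still matches.
    present = {str(name).strip() for name in columns}
    return [name for name in expected if name not in present]


# Column names copied (abridged) from the 0.11.0 output in the report.
reported = [
    "study_accession", "experiment_accession", "organism_taxid ",
    "library_strategy", "library_source", "library_selection",
    "run_accession",
]
print(missing_columns(reported))  # ['library_layout']

# Against live metadata (requires network access and pysradb installed):
# from pysradb.sraweb import SRAweb
# df = SRAweb().sra_metadata("SRP016501", detailed=True)
# print(missing_columns(df.columns))
```

An empty list from the live call would suggest that the installed version exports library_layout.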