kb_uploadmethods
Release Notes
Description
This module implements KBase apps that are used to transform data files into KBase data objects for use in analysis. This also includes a few apps for adding external files to the user's staging area, and manipulating them there.
See here for more information on uploading data to KBase.
Development
This module was created using the KBase SDK. See the documentation for more detail. Apps are mostly broken down into submodules. Instructions for setting up a local development environment can be found here (note that Docker is required).
The main entrypoint is in lib/kb_uploadmethods/kb_uploadmethodsImpl.py. Individual apps in that module use one or more utility modules to handle different file types. These are all under lib/kb_uploadmethods/Utils. Add a new Util module if you're adding an uploader for a new data type.
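As a rough illustration of what a new Util module might look like, here is a minimal sketch. The class name ImportExampleUtil, the "example" data type, the parameter keys, and the "scratch" config key are all hypothetical; check the existing modules under lib/kb_uploadmethods/Utils for the conventions this repository actually follows.

```python
"""Hypothetical skeleton for a file such as lib/kb_uploadmethods/Utils/ImportExampleUtil.py."""
import os


class ImportExampleUtil:
    """Illustrative uploader utility for a made-up 'example' data type."""

    # Parameter names are assumptions chosen for this sketch.
    REQUIRED_PARAMS = ("staging_file_subdir_path", "workspace_name", "output_name")

    def __init__(self, config):
        # config is assumed to be a dict handed over by the Impl class.
        self.scratch = config.get("scratch", "/tmp")

    def validate_params(self, params):
        """Raise ValueError if a required parameter is missing or empty."""
        for key in self.REQUIRED_PARAMS:
            if not params.get(key):
                raise ValueError(f"'{key}' parameter is required, but missing")

    def import_example_from_staging(self, params):
        """Validate the input and return a placeholder result."""
        self.validate_params(params)
        file_name = os.path.basename(params["staging_file_subdir_path"])
        local_path = os.path.join(self.scratch, file_name)
        # A real uploader would fetch the staged file to local_path and
        # save a typed object to the workspace at this point.
        return {"local_path": local_path, "output_name": params["output_name"]}
```

The Impl function for the new app would then construct this class and delegate to its import method, keeping file-type handling out of the entrypoint module.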
Testing
This module can be tested with the following steps, common to all KBase SDK modules.
1. Install the KBase SDK.
2. Fetch this module and navigate to it from the console.
3. Run kb-sdk test once. This will initialize the test_local/test.cfg file.
4. Populate the test_local/test.cfg file with a KBase authentication token (see here for details).
5. Run kb-sdk test again.
After completing steps 1-4 once, you can run kb-sdk test at any time to run the test suite.
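The token step can be scripted if you set up the module often. The sketch below assumes test_local/test.cfg is a plain key=value file and that the token is stored under a key named test_token; both are assumptions, so confirm them against the file that kb-sdk test generates. The helper name set_test_token is made up for this example.

```python
from pathlib import Path


def set_test_token(cfg_path, token, key="test_token"):
    """Set key=token in a key=value config file, replacing any existing entry.

    The key name 'test_token' is an assumption about the generated test.cfg.
    """
    path = Path(cfg_path)
    lines = path.read_text().splitlines() if path.exists() else []
    updated = []
    found = False
    for line in lines:
        # Only an uncommented line whose key matches exactly is replaced.
        if line.split("=", 1)[0].strip() == key:
            updated.append(f"{key}={token}")
            found = True
        else:
            updated.append(line)
    if not found:
        updated.append(f"{key}={token}")
    path.write_text("\n".join(updated) + "\n")


# Example call (the token value is a placeholder):
# set_test_token("test_local/test.cfg", "YOUR_KBASE_TOKEN")
```

Avoid committing the populated file, since it contains a credential.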
Apps
Each app is listed below with the following format:
Display name - what the user sees - links to module doc page, if available
app id: id - how the app is identified to the system, including the directory under ui/narrative/methods
entrypoint: function - name of the function that gets run in this module, as described in kb_uploadmethods.spec
output type: typed object(s) created
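Because every entry follows the same "key: value" layout, the list can be read mechanically, for example to cross-check app ids against entrypoints. The following is a minimal sketch, assuming each app starts with an "app id" line; parse_app_entries is a hypothetical helper and not part of this module.

```python
def parse_app_entries(lines):
    """Group 'key: value' bullet lines into one dict per app.

    A new app record starts at each 'app id' line. Lines without a colon
    (such as free-text notes) are skipped.
    """
    apps = []
    for raw in lines:
        # Drop the leading bullet marker and surrounding whitespace.
        line = raw.strip().lstrip("-").strip()
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key = key.strip().lower()
        # Tolerate a doubled colon after the key.
        value = value.lstrip(":").strip()
        if key == "app id":
            apps.append({})
        if apps:
            apps[-1][key] = value
    return apps


sample = [
    "- app id: unpack_staging_file",
    "- entrypoint: unpack_staging_file",
    "- input file types: any compressed file",
]
for app in parse_app_entries(sample):
    print(app["app id"], "->", app["entrypoint"])
```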
- app id: batch_import_assembly_from_staging
- entrypoint: batch_import_assemblies_from_staging
- description: Import FASTA files from your staging area into your Narrative as an Assembly data object.
- input file type: FASTA
- output type: KBaseSets.AssemblySet
- app id: batch_import_genome_from_staging
- entrypoint: batch_import_genomes_from_staging
- description: Import files (GenBank or GFF + FASTA) from your staging area into your Narrative as a Genome data object.
- input file types: GenBank, GFF with FASTA
- output type: KBaseSearch.GenomeSet
- app id: import_attribute_mapping_from_staging
- entrypoint: import_attribute_mapping_from_staging
- description: Import a TSV or Excel file from your staging area into your Narrative as an Attribute Mapping data object.
- input file types: TSV, Excel
- output type: KBaseExperiments.AttributeMapping
- app id: import_eschermap_from_staging
- entrypoint: import_eschermap_from_staging
- description: Import a JSON file from your staging area into your Narrative as a KBaseFBA.EscherMap data object.
- input file type: JSON (format not specified)
- output type: KBaseFBA.EscherMap
- app id: import_fasta_as_assembly_from_staging
- entrypoint: import_fasta_as_assembly_from_staging
- description: Import a FASTA file from your staging area into your Narrative as an Assembly data object.
- input file type: FASTA
- output type: KBaseGenomeAnnotations.Assembly
- app id: import_fasta_as_seqset_from_staging
- entrypoint: import_fasta_as_seqset_from_staging
- description: Import a FASTA file from your staging area into your Narrative as a Protein/DNA Sequence Set data object
- input file type: FASTA
- output type: KBaseSequences.DNASequenceSet or KBaseSequences.ProteinSequenceSet
- app id: import_fastq_sra_as_reads_from_staging
- entrypoint: import_reads_from_staging
- description: Import a FASTQ or SRA file into your Narrative as a Reads data object.
- input file types: FASTQ, SRA
- output type: KBaseFile.SingleEndLibrary or KBaseFile.PairedEndLibrary
- app id: import_file_as_fba_model_from_staging
- entrypoint: import_file_as_fba_model_from_staging
- description: Import a file in TSV, XLS (Excel) or SBML format from your staging area into your Narrative as an FBAModel.
- input file types: TSV, Excel, SBML
- output type: KBaseFBA.FBAModel
- app id: import_genbank_as_genome_from_staging
- entrypoint: import_genbank_from_staging
- description: Import a GenBank file from your staging area into your Narrative as a Genome data object.
- input file types: GenBank
- output type: KBaseGenomes.Genome
- app id: import_gff_fasta_as_genome_from_staging
- entrypoint: upload_fasta_gff_file
- description: Import a GFF or FASTA file from your staging area into your Narrative as a Genome data object.
- input file types: GFF3 and FASTA
- output type: KBaseGenomes.Genome
- app id: import_gff_fasta_as_metagenome_from_staging
- entrypoint: upload_metagenome_fasta_gff_file
- description: Import a GFF or FASTA file from your staging area into your Narrative as an annotated metagenome data object.
- input file types: GFF3 and FASTA
- output type: KBaseMetagenomes.AnnotatedMetagenomeAssembly
- app id: import_sra_as_reads_from_web
- entrypoint: import_sra_from_web
- description: This App allows the user to load SRA format read libraries directly into the workspace from sources on the web. In addition to standard HTTP and anonymous FTP links, the user may also obtain files from Google Drive and Dropbox links.
- input file type: SRA
- output type: KBaseFile.SingleEndLibrary or KBaseFile.PairedEndLibrary
- app id: import_tsv_as_expression_matrix_from_staging
- entrypoint: import_tsv_as_expression_matrix_from_staging
- description: Import a tab-delimited file from your staging area into your Narrative as an Expression Matrix.
- input file type: TSV
- output type: KBaseFeatureValues.ExpressionMatrix
- app id: import_tsv_as_phenotype_set_from_staging
- entrypoint: import_tsv_as_phenotype_set_from_staging
- description: Import a tab-delimited file in your staging area as a Phenotype Set.
- input file type: TSV
- output type: KBasePhenotypes.PhenotypeSet
- app id: import_tsv_excel_as_media_from_staging
- entrypoint: import_tsv_or_excel_as_media_from_staging
- description: Import a Media file (TSV or Excel) from your staging area.
- input file types: TSV, Excel
- output type: KBaseBiochem.Media
- app id: load_paired_end_reads_from_URL
- entrypoint: upload_fastq_file
- description: This App allows users to load FASTQ format paired-end read libraries directly into the workspace from sources on the web. In addition to standard HTTP and anonymous FTP links, the user may also obtain files from Google Drive and Dropbox links.
- input file types: FASTQ, FASTA
- output type: KBaseFile.PairedEndLibrary
- app id: load_single_end_reads_from_URL
- entrypoint: upload_fastq_file
- description: This App allows users to load FASTQ format single-end read libraries directly into the workspace from sources on the web. In addition to standard HTTP and anonymous FTP links, the user may also obtain files from Google Drive and Dropbox links.
- input file types: FASTQ, FASTA
- output type: KBaseFile.SingleEndLibrary
- app id: unpack_staging_file
- entrypoint: unpack_staging_file
- description: This App allows users to unpack a compressed file in the staging area. Recognizable compressed files include .zip, .gz, .bz2, .tar, .tar.gz, and .tar.bz2.
- input file types: any compressed file
- output type: none, this creates one or more new files in the staging area
- app id: upload_web_file
- entrypoint: unpack_web_file
- description: This App allows users to upload a data file (which may be compressed) from a web URL to the staging area. If the file is compressed (.gz or .zip), it will automatically be uncompressed. Using folders when uploading compressed archives of files is possible and encouraged. These folders are leveraged by downstream batch processing Apps and enable users to run tools on every file in the folder. We strongly recommend using this method to move large amounts of data into KBase, because the transfer is less likely to be interrupted than an upload directly from your laptop. Note that both Box and Dropbox offer a mechanism to share private files temporarily using links that are only accessible to someone who knows the link address.
- input file types: any
- output type: none, adds one or more files to the staging area
- Previously the "Import FASTQ/SRA File as Reads from Staging Area" app
- app id: import_sra_as_reads_from_staging
- entrypoint: import_sra_from_staging
- description: Import an SRA file from your staging area into your Narrative as a Reads object.
- input file type: SRA
- output type: KBaseFile.SingleEndLibrary or KBaseFile.PairedEndLibrary
- Previously the "Import FASTQ/SRA File as Reads from Staging Area" app
- app id: load_paired_end_reads_from_file
- entrypoint: import_fastq_interleaved_as_reads_from_staging
- description: Import a FASTQ file as interleaved reads from your staging area.
- input file types: FASTQ
- output type: KBaseFile.PairedEndLibrary
- Previously the "Import FASTQ/SRA File as Reads from Staging Area" app
- app id: load_single_end_reads_from_file
- entrypoint: import_fastq_noninterleaved_as_reads_from_staging
- description: Import a FASTQ file as non-interleaved reads from your staging area.
- input file types: FASTQ
- output type: KBaseFile.SingleEndLibrary
- Replaced by the "Import FASTQ/SRA File as Reads from Staging Area" app
- app id: load_paired_end_reads_from_file
- entrypoint: upload_fastq_file
- description: Import FASTA or FASTQ files as paired-end reads from your staging area.
- input file types: FASTA, FASTQ
- output type: KBaseFile.PairedEndLibrary
- Replaced by the "Import FASTQ/SRA File as Reads from Staging Area" app
- app id: load_single_end_reads_from_file
- entrypoint: upload_fastq_file
- description: Upload a single-end reads library from a FASTQ or FASTA file in your staging area.
- input file types: FASTA, FASTQ
- output type: KBaseFile.SingleEndLibrary
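The unpack_staging_file entry above lists the recognized compressed formats (.zip, .gz, .bz2, .tar, .tar.gz, and .tar.bz2). To show how such files can be told apart and unpacked, here is a minimal sketch that uses only the Python standard library. It is not the module's actual implementation; compression_kind and unpack are hypothetical names.

```python
import bz2
import gzip
import shutil
from pathlib import Path

# Multi-part suffixes come first so ".tar.gz" is not mistaken for ".gz".
ARCHIVE_SUFFIXES = (".tar.gz", ".tar.bz2", ".tar", ".zip")
STREAM_OPENERS = {".gz": gzip.open, ".bz2": bz2.open}


def compression_kind(filename):
    """Return the recognized compression suffix of filename, or None."""
    name = filename.lower()
    for suffix in ARCHIVE_SUFFIXES + tuple(STREAM_OPENERS):
        if name.endswith(suffix):
            return suffix
    return None


def unpack(path, dest_dir):
    """Unpack path into dest_dir and return dest_dir as a Path."""
    path = Path(path)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    kind = compression_kind(path.name)
    if kind is None:
        raise ValueError(f"not a recognized compressed file: {path.name}")
    if kind in ARCHIVE_SUFFIXES:
        # shutil infers the archive format from the file extension.
        shutil.unpack_archive(str(path), str(dest))
    else:
        # Single-stream compression: strip the suffix to name the output.
        target = dest / path.name[: -len(kind)]
        with STREAM_OPENERS[kind](path, "rb") as src, open(target, "wb") as out:
            shutil.copyfileobj(src, out)
    return dest
```

Only unpack archives from sources you trust, since archive members can carry unexpected paths.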