For which schema is a change/update being suggested?
I would like to request an update to the analysis_file.json / analysis_protocol.json schema.
What should the change/update be?
I would like to add two new fields:
* alignment_software, to allow data contributors to record the software used for the alignment of the FASTQ files' reads to a reference genome
* alignment_software_version, to allow data contributors to record the version of the software used for the alignment of the FASTQ files' reads to a reference genome
This update constitutes a major change to the schema(s) it affects.
What new field(s) need to be changed/added?
* Field name: alignment_software
* Field description: Name of alignment software used to map the FASTQ files to the reference genome
* Field type: string
* Required: yes
* Examples: cell ranger; kallisto bustools; GSNAP; STAR; Not Applicable
* CV or enum: no
* Field name: alignment_software_version
* Field description: Version of alignment software used to map the FASTQ files to the reference genome
* Field type: string
* Required: yes
* Examples: v1.0.1; 2.4.2a; Not Applicable
* CV or enum: no
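
To make the request concrete, here is a minimal sketch of how the two fields above might be expressed and checked. It is illustrative only: the `PROPOSED_FIELDS` dict, the `REQUIRED` list, and the `check_record` helper are hypothetical names, and the layout is a generic JSON Schema style, not necessarily how the actual schema files are organised.

```python
# Hypothetical sketch of the two proposed fields as JSON Schema style
# property definitions. The dict layout is an assumption for illustration.
PROPOSED_FIELDS = {
    "alignment_software": {
        "description": "Name of alignment software used to map the FASTQ "
                       "files to the reference genome",
        "type": "string",
    },
    "alignment_software_version": {
        "description": "Version of alignment software used to map the FASTQ "
                       "files to the reference genome",
        "type": "string",
    },
}

# Both fields are requested as required.
REQUIRED = ["alignment_software", "alignment_software_version"]


def check_record(record):
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    for name in REQUIRED:
        if name not in record:
            problems.append(f"missing required field: {name}")
        elif not isinstance(record[name], str):
            problems.append(f"{name} must be a string")
    return problems


# Example values taken from the field examples in this request.
print(check_record({"alignment_software": "STAR",
                    "alignment_software_version": "2.4.2a"}))
```

Because neither field uses a CV or enum, the sketch only checks presence and string type, so values such as "Not Applicable" pass.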
Why is the change requested?
These fields are needed to record the specific software, and its version, used to align FASTQ files to a reference genome.
They are among the upcoming Tier 1 metadata fields proposed by the HCA Integration Teams & Bionetwork.
The alignment software affects which cells are filtered per dataset, and which reads (introns and exons, or only exons) are counted as part of the reported transcriptome. This can introduce batch effects.