BIMSBbioinfo / pigx_rnaseq

Bulk RNA-seq Data Processing, Quality Control, and Downstream Analysis Pipeline
GNU General Public License v3.0

improve validation script to try to fail early with meaningful messages #56

Closed: borauyar closed this issue 2 years ago

borauyar commented 5 years ago
alexg9010 commented 5 years ago

Regarding the chromosome naming style check, you could have a look here: https://github.com/BIMSBbioinfo/pigx_chipseq/blob/master/scripts/Check_Config.py#L213-L267

borauyar commented 2 years ago

These commits fix the issues with the annotation files: https://github.com/BIMSBbioinfo/pigx_rnaseq/commit/9036acee3a24edc7bd1545229c2b548a737b60d0, https://github.com/BIMSBbioinfo/pigx_rnaseq/commit/014571fa6b7457efe3e0288596859b5f5ad894c4, https://github.com/BIMSBbioinfo/pigx_rnaseq/commit/efdd4f29a07d327a2791138025742c1c8ad4f197, https://github.com/BIMSBbioinfo/pigx_rnaseq/commit/a7fb557517e84fb7ba34cfaf6edd537ad6db8f26

  Check that settings.yaml is formatted correctly.
  Check that the input GTF file is parseable.
  Check that the transcript IDs in the cDNA file match the transcript_id field in the GTF file. If they do not, warn that the transcript -> gene ID mapping won't work for salmon results.
  Check that chromosome naming conventions (UCSC vs. NCBI styles) agree between the GTF file and the FASTA file.
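
For illustration only, the GTF, transcript ID, and chromosome naming checks listed above could be sketched roughly as follows. This is not the pipeline's actual validation script: the function names (`fasta_ids`, `gtf_seqnames_and_transcripts`, `naming_style`, `validate`) are hypothetical, and treating a leading `chr` as the marker of UCSC-style names is a simplifying assumption.

```python
import re


def fasta_ids(lines):
    """Collect sequence IDs (first token after '>') from FASTA lines."""
    ids = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                ids.add(fields[0])
    return ids


def gtf_seqnames_and_transcripts(lines):
    """Return (seqnames, transcript_ids); raise ValueError on a malformed row."""
    seqnames, transcripts = set(), set()
    for n, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split("\t")
        if len(fields) != 9:
            raise ValueError(
                f"GTF line {n}: expected 9 tab-separated fields, got {len(fields)}"
            )
        seqnames.add(fields[0])
        match = re.search(r'transcript_id "([^"]+)"', fields[8])
        if match:
            transcripts.add(match.group(1))
    return seqnames, transcripts


def naming_style(names):
    """Heuristic (assumption): all names start with 'chr' -> UCSC, none -> NCBI."""
    names = set(names)
    if not names:
        return "unknown"
    prefixed = sum(name.startswith("chr") for name in names)
    if prefixed == len(names):
        return "UCSC"
    if prefixed == 0:
        return "NCBI"
    return "mixed"


def validate(gtf_lines, genome_lines, cdna_lines):
    """Run the checks and return (errors, warnings) as lists of messages."""
    errors, warnings = [], []
    seqnames, gtf_transcripts = gtf_seqnames_and_transcripts(gtf_lines)

    gtf_style = naming_style(seqnames)
    genome_style = naming_style(fasta_ids(genome_lines))
    if gtf_style != genome_style:
        errors.append(
            f"Chromosome naming differs: GTF looks {gtf_style}, "
            f"genome FASTA looks {genome_style}."
        )

    missing = fasta_ids(cdna_lines) - gtf_transcripts
    if missing:
        warnings.append(
            f"{len(missing)} cDNA transcript ID(s) not found in the GTF; "
            "transcript -> gene ID mapping won't work for salmon results."
        )
    return errors, warnings
```

Each function takes an iterable of lines, so an open file handle can be passed directly, e.g. `validate(open(gtf_path), open(genome_path), open(cdna_path))`, where the three paths would come from settings.yaml. Returning messages instead of exiting immediately lets the caller report every problem at once before failing.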