STOmics / SAW

GNU General Public License v3.0

image registration failed #128

Open qiuyixmm opened 4 months ago

qiuyixmm commented 4 months ago

Hello, the SAW (version 7.0) pipeline stopped at the image registration step and threw the error shown below:

[image: screenshot of the error message]

Please help me at your convenience. Thanks!

Clouate commented 4 months ago

Hi, this may be because your gene expression matrix file, 02.count/SN.raw.gef, is incomplete. Please confirm that the SAW count step completed successfully, or upload all the log files to us.

qiuyixmm commented 4 months ago

Yes, you are right! There are no "transcript" terms in the 3rd column of my GTF annotation file. I have therefore switched to a new annotation file in GFF format, which has "transcript" and "mRNA" terms in its 3rd column, like this:

[image: excerpt of the GFF annotation file]
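For readers checking their own annotation, the feature types present in the 3rd column can be tallied with a short script. This is a minimal sketch and not part of SAW: the function name `count_feature_types` and the sample records are hypothetical, and it only assumes the standard 9-column, tab-separated GTF/GFF layout.

```python
from collections import Counter


def count_feature_types(lines):
    """Tally the values found in the 3rd column of GTF/GFF lines."""
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and comment/directive lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # A GTF/GFF record has 9 tab-separated columns; ignore anything else.
        if len(fields) < 9:
            continue
        counts[fields[2]] += 1
    return counts


# Hypothetical sample records, for illustration only.
sample = [
    "##gff-version 3",
    "chr1\tsrc\tgene\t1\t900\t.\t+\t.\tID=gene1",
    "chr1\tsrc\tmRNA\t1\t900\t.\t+\t.\tID=mrna1;Parent=gene1",
    "chr1\tsrc\texon\t1\t400\t.\t+\t.\tParent=mrna1",
    "chr1\tsrc\texon\t600\t900\t.\t+\t.\tParent=mrna1",
]

counts = count_feature_types(sample)
for feature, n in sorted(counts.items()):
    print(feature, n)
```

To run it on a real file, pass an open file handle instead of the sample list, since the function accepts any iterable of lines.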

I have some small questions:

  1. During counting, are reads mapped to exons of both "transcript" and "mRNA" terms counted when a GFF file is used as the annotation file?
  2. I notice that some genes only have "exon" terms and no "transcript" or "mRNA" terms. Are the read counts for these genes reported in the final gene expression matrix?
  3. For an annotation file derived from the Ensembl database, is the gene_id (ENSGALG00000016906) or the gene_name (SPRY2) used in the final gene expression matrix?
[image]

Thanks!

Clouate commented 4 months ago

Hi, I hope the following answers help:

  1. The reads mapped to exons of "mRNA" terms are counted.
  2. For a gene to be annotated successfully, the GFF file should include its 'gene', 'mRNA', and 'exon' lines, and their parent-child relationships need to be clear (through the 'Parent' field), as shown in the following image. In addition, you could use SAW checkGTF to check whether the format is correct.
     [image: example GFF lines showing the 'gene', 'mRNA', and 'exon' hierarchy]
  3. I noticed that you are using SAW v7.0, which only uses gene names in the final gene expression matrix. In the latest SAW v8.0 (https://www.stomics.tech/products/BioinfoTools/OfflineSoftware), the expression matrix includes both gene name and gene id.
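The gene / mRNA / exon hierarchy described in answer 2 can also be inspected by hand before running the pipeline. The following is a rough sketch under stated assumptions, not SAW's own validation (SAW checkGTF remains the tool to rely on): it assumes GFF3-style `ID=` and `Parent=` attributes in column 9, and the names `parse_attributes` and `find_incomplete_genes` as well as the sample records are hypothetical.

```python
def parse_attributes(column9):
    """Parse a GFF3 attribute column such as 'ID=x;Parent=y' into a dict."""
    attrs = {}
    for item in column9.strip().split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def find_incomplete_genes(lines):
    """Return {gene ID: problem} for genes lacking an mRNA or an exon under an mRNA."""
    gene_ids = []
    mrna_to_gene = {}      # mRNA ID -> parent gene ID
    exon_parents = set()   # every ID named as Parent by an exon line
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 9:
            continue
        feature = fields[2]
        attrs = parse_attributes(fields[8])
        # GFF3 allows several comma-separated parents.
        parents = attrs["Parent"].split(",") if "Parent" in attrs else []
        if feature == "gene" and "ID" in attrs:
            gene_ids.append(attrs["ID"])
        elif feature == "mRNA" and "ID" in attrs and parents:
            mrna_to_gene[attrs["ID"]] = parents[0]
        elif feature == "exon":
            exon_parents.update(parents)

    genes_with_mrna = set(mrna_to_gene.values())
    genes_with_exon = {g for m, g in mrna_to_gene.items() if m in exon_parents}
    report = {}
    for gene in gene_ids:
        if gene not in genes_with_mrna:
            report[gene] = "no mRNA child"
        elif gene not in genes_with_exon:
            report[gene] = "no exon under any mRNA"
    return report


# Hypothetical sample: gene2 has an exon attached directly to the gene, with no mRNA line.
sample = [
    "chr1\tsrc\tgene\t1\t900\t.\t+\t.\tID=gene1",
    "chr1\tsrc\tmRNA\t1\t900\t.\t+\t.\tID=mrna1;Parent=gene1",
    "chr1\tsrc\texon\t1\t400\t.\t+\t.\tID=exon1;Parent=mrna1",
    "chr1\tsrc\tgene\t2000\t2500\t.\t-\t.\tID=gene2",
    "chr1\tsrc\texon\t2000\t2500\t.\t-\t.\tID=exon2;Parent=gene2",
]

report = find_incomplete_genes(sample)
print(report)
```

Records are collected first and resolved afterwards, so the check does not depend on the order of lines in the file. Genes that appear in the report are candidates for the "exon only" case raised in question 2.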

qiuyixmm commented 4 months ago

It really helps me a lot. Thank you!