STOmics / SAW

GNU General Public License v3.0

Stomics raw.gef file empty? #131

Closed kf-cuanschutz closed 4 months ago

kf-cuanschutz commented 4 months ago

Hi,

I am running the Stomics container 0.7.1.0 and I am getting an exit status 1 at the registration step. The command being timed was:

"apptainer exec /gpfs/alpine1/scratch/kfotso@xsede.org/_ticket_/STomics/SAW_v7.1.sif register -i /gpfs/alpine1/scratch/kfotso@xsede.org/_ticket_/STomics/images/A03678D6/A03678D6_20240605_114628/A03678D6_SC_20240605_114628_3.0.4.tar.gz -c /gpfs/alpine1/scratch/kfotso@xsede.org/_ticket_/STomics/images/A03678D6/A03678D6_20240605_114628/A03678D6_SC_20240605_114628_3.0.4.ipr -v /gpfs/alpine1/scratch/kfotso@xsede.org/_ticket_/STomics/outs/D6/02.count/A03678D6.raw.gef -w True --core 32 -o /gpfs/alpine1/scratch/kfotso@xsede.org/_ticket_/STomics/outs/D6/03.register"

This is likely because I got an "Exception: SAW-A40004" referring to my A03678D6.raw.gef file.

Do you guys know why the previous count step could have created a corrupted .raw.gef file?

[ERRO 20240711-01-37-09 p916731 load_gef matrixloader.py:137] SAW-A40004: The sequencing data is empty, please confirm the /gpfs/alpine1/scratch/kfotso@xsede.org/_ticket_/STomics/outs/D6/02.count/A03678D6.raw.gef file.
E0711 01:37:09.377397 916731 matrixloader.py:137] SAW-A40004: The sequencing data is empty, please confirm the /gpfs/alpine1/scratch/kfotso@xsede.org/_ticket_/STomics/outs/D6/02.count/A03678D6.raw.gef file.
Traceback (most recent call last):
  File "register/register-v4.3.1/register/main.py", line 539, in <module>
  File "register/register-v4.3.1/register/main.py", line 535, in main
  File "register/register-v4.3.1/register/main.py", line 356, in __init__
  File "register/register-v4.3.1/register/main.py", line 502, in run
  File "register/register-v4.3.1/register/registration/registration.py", line 99, in registration
  File "register/register-v4.3.1/register/utils/matrixloader.py", line 36, in load
  File "register/register-v4.3.1/register/utils/matrixloader.py", line 138, in load_gef
Exception: SAW-A40004: The sequencing data is empty, please confirm the /gpfs/alpine1/scratch/kfotso@xsede.org/_ticket_/STomics/outs/D6/02.count/A03678D6.raw.gef file.
Command exited with non-zero status 1
kf-cuanschutz commented 4 months ago

Indeed, checking the gef file, it looks empty. I used the script from here: https://stereopy.readthedocs.io/en/latest/Tutorials/IO.html

[2024-07-11 12:05:24][Stereo][3635475][MainThread][140327550932800][reader][1371][INFO]: This is GEF file which contains traditional bin infomation.
[2024-07-11 12:05:24][Stereo][3635475][MainThread][140327550932800][reader][1372][INFO]: bin_type: bins
[2024-07-11 12:05:24][Stereo][3635475][MainThread][140327550932800][reader][1375][INFO]: Bin size list: ['bin1']
[2024-07-11 12:05:24][Stereo][3635475][MainThread][140327550932800][reader][1381][INFO]: Resolution: 500
[2024-07-11 12:05:24][Stereo][3635475][MainThread][140327550932800][reader][1384][INFO]: Gene count: 0
[2024-07-11 12:05:24][Stereo][3635475][MainThread][140327550932800][reader][1393][INFO]: offsetX: 4294967295
[2024-07-11 12:05:24][Stereo][3635475][MainThread][140327550932800][reader][1396][INFO]: offsetY: 4294967295
[2024-07-11 12:05:24][Stereo][3635475][MainThread][140327550932800][reader][1399][INFO]: Width: 1
[2024-07-11 12:05:24][Stereo][3635475][MainThread][140327550932800][reader][1402][INFO]: Height: 1
[2024-07-11 12:05:24][Stereo][3635475][MainThread][140327550932800][reader][1405][INFO]: Max Exp: 0
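
The summary above has the shape one would expect from an empty matrix: Gene count and Max Exp are both 0, and offsetX/offsetY equal 4294967295, which is 2^32 - 1, the largest unsigned 32-bit value (presumably a minimum that was never updated). A minimal Python sketch that flags this pattern from the reader's log text follows. The helper names `parse_gef_summary` and `looks_empty` are illustrative and not part of stereopy, and the parsing assumes the log line layout shown above.

```python
import re

UINT32_MAX = 2**32 - 1  # 4294967295, the offsetX/offsetY value in the log above


def parse_gef_summary(log_text):
    """Collect integer 'key: value' fields from the message part of each log line."""
    fields = {}
    for line in log_text.splitlines():
        # Keep only the text after the final ']: ' separator, e.g. 'Gene count: 0'.
        message = line.rsplit("]: ", 1)[-1]
        match = re.fullmatch(r"\s*([A-Za-z ]+):\s*(\d+)\s*", message)
        if match:
            fields[match.group(1).strip()] = int(match.group(2))
    return fields


def looks_empty(fields):
    """Heuristic: no genes, no expression, and both offsets at the uint32 maximum."""
    no_genes = fields.get("Gene count") == 0
    no_counts = fields.get("Max Exp") == 0
    sentinel_offsets = (
        fields.get("offsetX") == UINT32_MAX and fields.get("offsetY") == UINT32_MAX
    )
    return no_genes and no_counts and sentinel_offsets
```

Passing the log text above through `looks_empty(parse_gef_summary(text))` is a quick way to confirm that the count step wrote a matrix with no data before moving on to registration.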
Clouate commented 4 months ago

Hi, could you confirm whether the SAW count step completed successfully, or upload all the log files for us? In addition, you could use SAW checkGTF to check whether the GTF/GFF file format is correct; a malformed annotation file is a common cause of gene annotation failure.
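
As a rough pre-check alongside SAW checkGTF, one can count how many records in the annotation look like usable gene entries. The Python sketch below is not SAW's validation logic. It only assumes the standard nine tab-separated GTF columns, with the feature type in column 3 and a `gene_id` attribute in column 9. The function name `count_gene_records` is illustrative.

```python
import re

# Matches a GTF-style attribute such as: gene_id "ENSG00000000001";
GENE_ID = re.compile(r'gene_id\s+"([^"]+)"')


def count_gene_records(lines):
    """Return (gene records with a gene_id, data lines without nine columns)."""
    genes = 0
    malformed = 0
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and '#' comment/header lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            malformed += 1
            continue
        if cols[2] == "gene" and GENE_ID.search(cols[8]):
            genes += 1
    return genes, malformed
```

For example, `count_gene_records(open("genes.tdTomato.gtf"))` returns both counts for the annotation used in this thread. A gene count of 0 would mean the file has no rows this sketch recognizes as genes, which is worth comparing against what checkGTF reports.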

kf-cuanschutz commented 4 months ago

Hi, thank you for the help! It looks like SAW checkGTF does not output anything. Is that normal? Below is the command I used:

singularity exec /gpfs/alpine1/scratch/kfotso@xsede.org/_ticket_/STomics/SAW_v7.1.sif checkGTF -i $WORK_DIR/genome_ref/genes.tdTomato.gtf -o $WORK_DIR/genome_ref/genes.tdTomato_new.gtf
kf-cuanschutz commented 4 months ago

And I think you are right: some things did not go as expected during the counting step. In the log file I can see a few " (null) num wrong" messages.

Clouate commented 4 months ago

Hi, you can find the logs/checkGTF.*log file in the directory where you ran SAW checkGTF. Please check it to see whether the genes were filtered out.

kf-cuanschutz commented 4 months ago

Thanks! I think that the file is clearly problematic. I got a "No valid gene found!". So that's clear. I think I am going to check back with the users I am helping. Thank you again! Closing this issue now.
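
For anyone hitting the same error, the log check described in this thread can be scripted. The Python sketch below scans the `logs/checkGTF.*log` files under the directory where checkGTF was run and returns lines containing a given message, such as the "No valid gene found" text quoted above. The function name `find_checkgtf_messages` is illustrative, and the sketch assumes the logs are plain text.

```python
from pathlib import Path


def find_checkgtf_messages(run_dir, needle):
    """Return (log path, line) pairs from logs/checkGTF.*log that contain `needle`."""
    hits = []
    for log_path in sorted(Path(run_dir).glob("logs/checkGTF.*log")):
        # errors="replace" keeps the scan going if a log has odd bytes in it.
        for line in log_path.read_text(errors="replace").splitlines():
            if needle in line:
                hits.append((log_path, line.strip()))
    return hits
```

Calling `find_checkgtf_messages(".", "No valid gene found")` from the run directory returns an empty list when no log contains the message.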