Open carlmed00 opened 2 years ago
Hello! I do not have a solution for you, but I have run into a similar issue and may have found what is causing it. I am also running StringTie 2.2.1, and I have previously run StringTie 2.1.3b successfully.
For clarity, I ran this pipeline (based on Pertea et al., 2016):
In the resulting files, this is what I noticed was different from my previous results with StringTie (2.1.3b):

gene_counts_matrix.csv:
- There were a lot more 0 values for genes than when using StringTie (2.1.3b).
- Also, the last row of the spreadsheet has a gene named "<class 'str'>" with high values.

transcript_counts_matrix.csv:
- Many samples have empty cells for some transcript rows, while other samples have values in those rows. There is no placeholder value in those cells, not even a 0.
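In case it helps anyone quantify the same symptoms, here is a rough Python 3 sketch that counts zero cells, empty cells, and odd row names in a count matrix. It assumes a plain comma-separated file whose first column is the gene or transcript ID and whose remaining columns are samples. The function name `summarize_count_matrix` and the demo data are made up for illustration and are not part of StringTie or prepDE.py.

```python
import csv
import io


def summarize_count_matrix(handle):
    """Count zero cells, empty cells, and suspicious row names in a count matrix.

    Assumes the first column holds the feature ID and every other column
    is one sample (an assumption about the CSV layout, not a documented spec).
    """
    reader = csv.reader(handle)
    header = next(reader)
    n_samples = len(header) - 1
    zero_cells = 0
    empty_cells = 0
    suspicious_ids = []
    for row in reader:
        if not row:
            continue  # skip completely blank lines
        feature_id, values = row[0], row[1:]
        # Pad short rows so missing trailing cells are counted as empty.
        values = values + [""] * (n_samples - len(values))
        # Row names such as "<class 'str'>" are not real gene IDs.
        if feature_id.startswith("<class"):
            suspicious_ids.append(feature_id)
        for value in values:
            value = value.strip()
            if value == "":
                empty_cells += 1
                continue
            try:
                if float(value) == 0:
                    zero_cells += 1
            except ValueError:
                # Treat non-numeric cells like empty ones for this summary.
                empty_cells += 1
    return {
        "zero_cells": zero_cells,
        "empty_cells": empty_cells,
        "suspicious_ids": suspicious_ids,
    }


# Tiny in-memory demo (fabricated values, only to show the call).
demo = "gene_id,s1,s2\ng1,0,5\ng2,,3\n<class 'str'>,100,200\n"
print(summarize_count_matrix(io.StringIO(demo)))
```

For a real file, open it with `open(path, newline="")` and pass the handle in. Running it on the matrices from both StringTie versions gives a quick side-by-side comparison.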
I also tried Python 2.7 with prepDE.py and got the same error you reported. After going through all of the earlier output files, I found that the error message is accurate. In the per-sample merged.gtf output files (the ones that contain the gene_id, transcript_id, FPKM, and TPM values), certain samples do not list all of the transcript IDs: some transcript IDs appear in some samples' merged.gtf files but not in others.
Something may be going wrong in the step that merges transcripts from all samples, but I am not certain.
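To check which transcript IDs are missing from which sample, something like the following Python 3 sketch could be used. It assumes standard tab-separated GTF lines with a `transcript_id "..."` attribute in the ninth column. The helper names `transcript_ids` and `missing_per_sample` are hypothetical, and the example lines are invented.

```python
import re

# Matches the transcript_id attribute in a GTF attribute column.
TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def transcript_ids(gtf_lines):
    """Collect transcript IDs from the 'transcript' records of a GTF."""
    ids = set()
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "transcript":
            continue  # only look at transcript-level records
        match = TRANSCRIPT_ID.search(fields[8])
        if match:
            ids.add(match.group(1))
    return ids


def missing_per_sample(samples):
    """Map each sample name to the IDs seen in other samples but not in it.

    `samples` is a dict of sample name -> set of transcript IDs.
    """
    all_ids = set().union(*samples.values())
    return {name: sorted(all_ids - ids) for name, ids in samples.items()}


# Invented example records, only to show the call.
sample_a = [
    'chr1\tStringTie\ttranscript\t1\t100\t1000\t+\t.\tgene_id "G1"; transcript_id "T1";\n',
    'chr1\tStringTie\ttranscript\t200\t300\t1000\t+\t.\tgene_id "G2"; transcript_id "T2";\n',
]
sample_b = [
    'chr1\tStringTie\ttranscript\t1\t100\t1000\t+\t.\tgene_id "G1"; transcript_id "T1";\n',
]
print(missing_per_sample({
    "A": transcript_ids(sample_a),
    "B": transcript_ids(sample_b),
}))
```

With real data, each set would come from `transcript_ids(open(path))` for that sample's GTF. If every sample was estimated against the same merged annotation, I would expect every list to come back empty.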
Update: I ran the same reads and parameters with StringTie 2.1.3b and there were no issues. I looked into other forums, and it seems that similar issues are occurring with this specific version (2.2.1). Maybe @gpertea would be able to help.
I am using v2.2.1 and encounter a similar error.
I rechecked and made sure that I had used -e for all my generated files, but I still get the same error.
Originally posted by @carlmed00 in https://github.com/gpertea/stringtie/issues/337#issuecomment-1281851293