conJUSTover / pSONIC

This repository is the official public host of the pSONIC program (Conover et al., 2021).
GNU General Public License v3.0

Error with MCScanX step #6

Closed ssamberkar closed 2 years ago

ssamberkar commented 2 years ago

Hi Justin,

Thanks for your explanation on using the GFF file.

However, at the MCScanX step I got this error, apparently while populating a dictionary (pardon my lack of Python expertise):

    Traceback (most recent call last):
      File "/home/ssa18/tools/pSONIC/pSONIC.py", line 444, in <module>
        parse_args()
      File "/home/ssa18/tools/pSONIC/pSONIC.py", line 440, in parse_args
        else: translate(args.gff, args.prefix, args.sequenceIDs)
      File "/home/ssa18/tools/pSONIC/pSONIC.py", line 331, in translate
        line[1] = gene_code[line[1]]
    KeyError: 'GeneID'

The command I used:

python3 ~/tools/pSONIC/pSONIC.py test_gff translate_gff -t 16 -gff ./gff/rice_test.gff -sID ./fasta/protein/orthofinder_test/OrthoFinder/Results_Nov24/WorkingDirectory/SequenceIDs.txt

My GFF file snapshot:

    Sp##  GeneID                   Start_POS  End_POS
    az01  Osazucena_01g0000010.01  2760       10183
    az01  Osazucena_01g0000020.01  10737      11801
    az01  Osazucena_01g0000030.01  10738      11793
    az01  Osazucena_01g0000040.01  12077      15281
    az01  Osazucena_01g0000050.01  15658      17670
    az01  Osazucena_01g0000060.01  22200      26348
    az01  Osazucena_01g0000060.02  22207      26268
    az01  Osazucena_01g0000070.01  26495      28061
    az01  Osazucena_01g0000070.02  26495      28061

I have three rice cultivars in this GFF file. I'm sure it is a formatting issue, as all the gene IDs have entries in their corresponding protein FASTA files.

Let me know what I missed.

Best, Sandeep

conJUSTover commented 2 years ago

I think if you get rid of the header line in your GFF file, it should work fine. Let me know if it doesn't, and I can dig into why this may be occurring.
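
For readers hitting the same `KeyError: 'GeneID'`: the traceback suggests the header row is being treated as a gene record, so its second column (`GeneID`) is looked up as if it were a gene name. A minimal sketch of removing that row before running `translate_gff` is shown here. The function name `strip_gff_header` and its `header_token` parameter are hypothetical helpers for illustration, not part of pSONIC, and the sketch assumes a whitespace-delimited file like the snapshot in this thread.

```python
from pathlib import Path


def strip_gff_header(src, dst, header_token="GeneID"):
    """Copy src to dst, dropping rows whose second column equals header_token.

    Hypothetical helper (not part of pSONIC). Returns the number of rows kept.
    """
    kept = []
    for line in Path(src).read_text().splitlines():
        fields = line.split()
        # A header row such as "Sp## GeneID Start_POS End_POS" is skipped here.
        if len(fields) >= 2 and fields[1] == header_token:
            continue
        kept.append(line)
    Path(dst).write_text("\n".join(kept) + ("\n" if kept else ""))
    return len(kept)
```

For example, `strip_gff_header("./gff/rice_test.gff", "./gff/rice_test.noheader.gff")` would write a header-free copy (the output file name is made up for this example) that can then be passed to `-gff`.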

ssamberkar commented 2 years ago

Ok, that worked but now I only get <PREFIX>.gff and not <PREFIX>.blast.

conJUSTover commented 2 years ago

.blast must be generated from an OrthoFinder run. In the Working Directory from the OrthoFinder results, there should be several .blast.gz (or equivalent) files. You can unzip and concatenate those together to generate the .blast file.
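
The unzip-and-concatenate step described above could be sketched in Python roughly as follows. This is only an illustration: the function name `concat_blast` is hypothetical, and the default glob pattern `Blast*.txt.gz` is an assumption about how the compressed files are named in the OrthoFinder working directory, so adjust it to whatever your run actually produced.

```python
import glob
import gzip
import os
import shutil


def concat_blast(working_dir, out_path, pattern="Blast*.txt.gz"):
    """Decompress every file matching pattern in working_dir into out_path.

    Hypothetical helper (not part of pSONIC or OrthoFinder). The default
    pattern is an assumption; returns the list of files that were merged.
    """
    paths = sorted(glob.glob(os.path.join(working_dir, pattern)))
    with open(out_path, "wb") as out:
        for path in paths:
            # gzip.open streams the decompressed bytes; copy them verbatim.
            with gzip.open(path, "rb") as handle:
                shutil.copyfileobj(handle, out)
    return paths
```

Calling `concat_blast(working_dir, "rice_test.blast")` with the OrthoFinder working directory would give the `.blast` file the same prefix as the `.gff` file; an empty return value means the pattern matched nothing and needs changing.
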

ssamberkar commented 2 years ago

Alright, everything worked fine until the last step, which gave this error:

    Starting pSONIC
    Traceback (most recent call last):
      File "/home/ssa18/tools/pSONIC/pSONIC.py", line 444, in <module>
        parse_args()
      File "/home/ssa18/tools/pSONIC/pSONIC.py", line 442, in parse_args
        main(args.prefix, args.orthogroups, args.threads, args.ploidy, args.sequenceIDs, args.speciesIDs)
      File "/home/ssa18/tools/pSONIC/pSONIC.py", line 363, in main
        with open(prefix + ".tandem", "r") as handle:
    FileNotFoundError: [Errno 2] No such file or directory: 'rice_test.tandem'

conJUSTover commented 2 years ago

The .tandem file should have been generated by MCScanX. Please ensure that the file prefix for the .tandem file is the same as for the .blast and .gff files.
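
A quick pre-flight check along these lines can catch a prefix mismatch before the final step is launched. This is a sketch only: `missing_inputs` is a hypothetical helper, and the extension list covers just the three files named in this thread (`.gff`, `.blast`, `.tandem`), so pSONIC may expect others as well.

```python
import os


def missing_inputs(prefix, extensions=(".gff", ".blast", ".tandem")):
    """Return the expected <prefix><ext> paths that do not exist on disk.

    Hypothetical helper (not part of pSONIC); the extension list is an
    assumption based on the files mentioned in this thread.
    """
    return [prefix + ext for ext in extensions if not os.path.isfile(prefix + ext)]
```

For the run above, `missing_inputs("rice_test")` would have listed `rice_test.tandem`, pointing to either a missing MCScanX output or a different prefix.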

ssamberkar commented 2 years ago

Done. It's working now. Thanks for your blazing fast responses.