Closed chrisquince closed 5 years ago
Looking at dependencies, I realized a new COG database was available and updated it, but forgot to update the cdd_to_cog file. I will regenerate that file at some point this week. The info needed is at ftp://ftp.ncbi.nih.gov/pub/mmdb/cdd/cddid.tbl.gz so it's easy. In the meantime it is possible to run with the previous COG database: I did not delete it, only moved it, so you can just change the config file and use: /mnt/gpfs/seb/Database/rpsblast_cog_db/old_cogs
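For reference, regenerating the mapping from cddid.tbl.gz could look roughly like the sketch below. It assumes cddid.tbl is tab-separated with the PSSM id in the first column and the accession in the second, and that cdd_to_cog.tsv is a two-column file (PSSM id, COG accession); the real layout of cdd_to_cog.tsv is not shown in this thread, so the output columns may need adjusting. The function names are hypothetical.

```python
import csv
import gzip


def cdd_to_cog_rows(lines):
    """Yield (pssm_id, cog_accession) pairs from cddid.tbl lines.

    Assumes tab-separated lines with the PSSM id in column 1 and the
    accession in column 2; only accessions starting with "COG" are kept.
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        pssm_id, accession = fields[0], fields[1]
        if accession.startswith("COG"):
            yield pssm_id, accession


def write_mapping(tbl_gz_path, out_path):
    """Write the assumed two-column cdd_to_cog table from a local cddid.tbl.gz."""
    with gzip.open(tbl_gz_path, "rt", encoding="utf-8", errors="replace") as src, \
            open(out_path, "w", newline="") as dst:
        writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
        writer.writerows(cdd_to_cog_rows(src))
```

After downloading the file, something like `write_mapping("cddid.tbl.gz", "cdd_to_cog.tsv")` would produce the table; it should be regenerated whenever the rpsblast COG database is updated so the two stay in sync.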
OK Seb, we can close this when the new cdd file is uploaded. More care needs to be taken, though, to avoid these issues, as this has put the analysis back by a week and wasted yesterday afternoon for me. The test example probably would not have helped here, but error checking in the scripts would have (I have added a small bit).
I'm sorry I made you lose a lot of time. I tend to do multiple things at the same time, which makes me error prone. This is a typical example where I changed something without testing it or checking which linked things should also be changed. I will improve that. I just updated the cdd_to_cog file, so it should work now.
So the current version seems to be using incompatible COG databases. For example, the entry 'gnl|CDD|319244' can be returned as an rpsblast hit, but it is not present in cdd_to_cog.tsv:
grep "319244" /mnt/gpfs/Hackathon/FMTMeren/STRONG/COG_pipe/scg_data/cdd_to_cog.tsv
This causes the pipeline to crash. I will do a temporary fix by adding some error checking to ./Filter_Cogs.py, but we need to work out why we have incompatible results.
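The kind of error checking meant here could be sketched as follows. This is not the actual code in Filter_Cogs.py: the function names are hypothetical, and it assumes cdd_to_cog.tsv is tab-separated with the numeric CDD id in the first column and the COG id in the second. Instead of failing on a missing key, hits with no mapping are collected and reported.

```python
import sys


def load_cdd_to_cog(path):
    """Read the mapping file into a dict (assumed columns: CDD id, COG id)."""
    mapping = {}
    with open(path) as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            if len(fields) >= 2:
                mapping[fields[0]] = fields[1]
    return mapping


def cdd_number(subject_id):
    """Extract the numeric id from an id such as 'gnl|CDD|319244'."""
    return subject_id.rsplit("|", 1)[-1]


def map_hits(subject_ids, mapping, log=sys.stderr):
    """Split hits into (mapped, missing) rather than raising KeyError."""
    mapped, missing = [], []
    for sid in subject_ids:
        key = cdd_number(sid)
        if key in mapping:
            mapped.append((sid, mapping[key]))
        else:
            missing.append(sid)
    if missing:
        # A non-empty list here suggests the rpsblast database and the
        # mapping file come from different CDD releases.
        print(f"warning: {len(missing)} hit(s) absent from mapping, "
              f"e.g. {missing[0]}", file=log)
    return mapped, missing
```

Skipping unmapped hits keeps the run alive, but the warning should not be ignored: a large number of missing ids points to a version mismatch rather than a few stray entries.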