donovan-h-parks / CompareM

A toolbox for comparative genomics.
GNU General Public License v3.0

can't find orthologs between genomes #6

Closed ceya closed 7 years ago

ceya commented 8 years ago

Hello,

I'm running comparem aai_wf with default settings to calculate AAI between my faa genome files, but the program cannot find any orthologs. With the AAI calculator from the Kostas lab, the score is 99. Does anyone know what the problem might be?

Thank you! Brooke

donovan-h-parks commented 8 years ago

Hello Brooke,

CompareM is still in active development, but I am very surprised to hear this result. Do you mind sending me your genome files and the exact command you are running so I can investigate?

Cheers, Donovan (donovan.parks [at] gmail.com)

ceya commented 8 years ago

Hello Donovan,

Of course, attached are two of my faa files. I used the default settings: comparem aai_wf test aai_output (test being the directory containing the faa files). I also tried adjusting the e-value and identity cutoffs, but it didn't help much. I'd really appreciate it if you could have a look and help me identify what the issue might be.

Thank you very much! Brooke

donovan-h-parks commented 8 years ago

Hello Brooke,

I don't see any attached files. Any chance you forgot to attach them? :)

Cheers, Donovan

ceya commented 8 years ago

Hi Donovan,

Can you see them now? I attached four faa files this time.

Regards, Brooke

donovan-h-parks commented 8 years ago

Hello Brooke,

GitHub is very strict about which files can be attached. Can you just email them to me directly at donovan.parks [at] gmail.com?

Thanks, Donovan

ceya commented 8 years ago

Done!

donovan-h-parks commented 8 years ago

Hello Brooke,

CompareM processes your genomes fine on my end. I am running the program as follows:

comparem aai_wf test aai_output --file_ext faa --proteins --cpus 40

The --file_ext flag indicates that *.faa files should be processed, and the --proteins flag indicates that the FASTA files already contain proteins, so gene calling is not necessary. The results indicate that all these genomes are nearly identical (AAI >= 99.98%).

Can you verify that you have the latest version of CompareM (v0.0.16) and DIAMOND >= v0.8.10?

Cheers, Donovan
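
For anyone checking the version requirement above by script, here is a minimal sketch of a numeric version comparison. The helper names `parse_version` and `meets_minimum` are hypothetical and not part of CompareM or DIAMOND. It assumes the version string is purely dot-separated integers with an optional leading "v", as in the versions quoted in this thread.

```python
def parse_version(text):
    """Turn a string such as 'v0.8.10.72' into a tuple of ints: (0, 8, 10, 72)."""
    return tuple(int(part) for part in text.strip().lstrip("v").split("."))


def meets_minimum(installed, minimum):
    """Compare versions numerically; plain string comparison would rank
    '0.8.9' above '0.8.10', which is the wrong way round."""
    return parse_version(installed) >= parse_version(minimum)


if __name__ == "__main__":
    # Versions taken from this thread: installed DIAMOND vs. the stated minimum.
    print(meets_minimum("v0.8.10.72", "0.8.10"))
```

Comparing tuples of integers avoids the lexicographic trap that a naive string comparison falls into once a component reaches two digits.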

ceya commented 8 years ago

Hi Donovan,

I have CompareM 0.0.16 and DIAMOND v0.8.10.72. I ran the same command except for the --cpus setting and still get 0 orthologs.

Could it be that the program wasn't installed correctly? I sent the whole aai_output folder to your Gmail account. Could you take a look at the files?

Thank you very much! Brooke

donovan-h-parks commented 8 years ago

Hello Brooke,

I think it may have to do with different flavours of Linux. What OS are you using?

Cheers, Donovan

donovan-h-parks commented 8 years ago

Can you also try running CompareM on just the 4 genomes you sent me and then send me the output directory? That way I can directly compare my output to yours to see where they differ.

ceya commented 8 years ago

I'm using a terminal on 64-bit CentOS Linux.

donovan-h-parks commented 8 years ago

Hello Brooke,

Thank you for helping with the debugging. It does appear to be a Linux flavour issue. Specifically, it has to do with the sort command of the OS. CompareM uses sort to index the resulting DIAMOND hits file, and I can see that on your machine the file is not actually sorted. Can you check which sort you have on your path (i.e., run which sort and let me know what it returns)? I'm digging around the net to see if this is a CentOS issue, but I suspect you may just have a different sort program on your path rather than the standard Unix program.

Thanks Donovan
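
To check for the symptom described above (a hits file that was passed through sort but is not actually sorted), a minimal sketch follows. The function name `first_unsorted_line` is hypothetical. The sketch assumes a tab-separated file sorted on its first column, and it compares keys by code point, which matches byte order only for ASCII identifiers. The exact sort key CompareM uses is not stated in this thread, so treat the column choice as an assumption.

```python
import io


def first_unsorted_line(handle, key_col=0, sep="\t"):
    """Return the 1-based number of the first line whose key is smaller than
    the key on the previous line, or None if the keys never decrease."""
    prev = None
    for line_no, line in enumerate(handle, start=1):
        key = line.rstrip("\n").split(sep)[key_col]
        if prev is not None and key < prev:
            return line_no
        prev = key
    return None


if __name__ == "__main__":
    # Tiny in-memory stand-in for a hits file; the gene IDs are made up.
    demo = io.StringIO("geneA\thit1\ngeneC\thit2\ngeneB\thit3\n")
    print(first_unsorted_line(demo))
```

To run it on a real file, open the hits table and pass the handle, for example `with open(path) as fh: first_unsorted_line(fh)`, where `path` points at whichever hits file is under suspicion.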

ceya commented 8 years ago

Hi Donovan,

Here's the output:

[brooke@tatanka comparem]$ which sort
/bin/sort

donovan-h-parks commented 8 years ago

Hey,

Seems like this may be a bug with the CentOS implementation of sort: https://bugs.centos.org/view.php?id=9289

Perhaps you need to update to coreutils-8.22-13.el7.

Cheers, Donovan
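
A quick way to test whether the sort on a given machine behaves as expected is to feed it a small sample and inspect the result. The sketch below is a rough sanity check, not CompareM's own code. The function name `sort_command_works`, the sample lines, and the `-k1,1` key are all assumptions made for illustration. It needs Python 3.7+ and runs independently of whatever Python environment CompareM is installed in.

```python
import os
import subprocess

# Made-up sample: tab-separated lines, deliberately out of order on column 1.
SAMPLE = ["g2\tx\n", "g10\ty\n", "g1\tz\n", "g2\ta\n"]


def sort_command_works(sort_cmd=("sort", "-k1,1")):
    """Pipe SAMPLE through sort_cmd under LC_ALL=C and report whether the
    output keeps every line and has a non-decreasing first column."""
    env = dict(os.environ, LC_ALL="C")  # byte-order collation, no locale rules
    result = subprocess.run(
        list(sort_cmd),
        input="".join(SAMPLE),
        capture_output=True,
        text=True,
        env=env,
        check=True,
    )
    out = result.stdout.splitlines(keepends=True)
    keys = [line.split("\t")[0] for line in out]
    return sorted(out) == sorted(SAMPLE) and keys == sorted(keys)


if __name__ == "__main__":
    print("sort on PATH looks sane:", sort_command_works())
```

If this returns False for the sort found on the PATH, that is consistent with the unsorted hits file described earlier in the thread, though it does not by itself confirm the specific CentOS bug linked above.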

ceya commented 8 years ago

Thanks Donovan! I'll contact the admins and let them know.

Regards, Brooke

frederic-foucault commented 7 years ago

Hello,

Same result here (no orthologs found) on OS X 10.11.6. All required binaries were installed or compiled (DIAMOND).

which sort
sort is /usr/bin/sort

juliambrosman commented 5 years ago

In case this is useful for future strugglers: installing CompareM into a conda env with bioconda coreutils overcame this issue for me.

conda create -n comparem python=2.7 scipy  
conda install --name comparem -c bioconda coreutils  
source activate comparem  
pip install comparem