erikrikarddaniel / pfitmap

An organism may have several entries in a single database #101

Closed erikrikarddaniel closed 10 years ago

erikrikarddaniel commented 10 years ago

Thermotoga maritima MSB8 appears to have four RefSeq and three GenBank entries for its NrdDh1 protein:

select dbe.db, dbe.acc, hp.protein_name, hrr.fullseq_score, ps.id
from db_entries dbe
  join hmm_result_rows hrr on dbe.db_sequence_id = hrr.db_sequence_id
  join hmm_results hr on hrr.hmm_result_id = hr.id
  join hmm_profiles hp on hr.hmm_profile_id = hp.id
  join sequence_sources ss on hr.sequence_source_id = ss.id
  left join pfitmap_sequences ps on dbe.db_sequence_id = ps.db_sequence_id and hp.id = ps.hmm_profile_id
where dbe.db_sequence_id = 68463
  and ss.version = '2013-08-26'
  and hp.protein_name ~ 'NrdDh';

Proposed fix: count each unique db_sequence object only once. When a sequence has several entries in the same database, choose the entry whose accession number comes first in alphabetical order.
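
A minimal sketch of that selection rule, not the code from the actual fix. It uses an in-memory SQLite table that borrows only the `db`, `acc` and `db_sequence_id` column names from the query above. The accession values and the `ref`/`gb` database labels are made up for illustration, and the real schema has more columns.

```python
import sqlite3

# Hypothetical rows (db, acc, db_sequence_id); accessions and db labels are invented.
rows = [
    ("ref", "ACC_C", 68463),
    ("ref", "ACC_A", 68463),
    ("ref", "ACC_B", 68463),
    ("gb", "ACC_Z", 68463),
    ("gb", "ACC_Y", 68463),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE db_entries (db TEXT, acc TEXT, db_sequence_id INTEGER)"
)
conn.executemany("INSERT INTO db_entries VALUES (?, ?, ?)", rows)

# Keep one entry per (db_sequence_id, db): the alphabetically first accession.
representatives = conn.execute(
    """
    SELECT db_sequence_id, db, MIN(acc) AS acc
    FROM db_entries
    GROUP BY db_sequence_id, db
    ORDER BY db_sequence_id, db
    """
).fetchall()

print(representatives)
```

With these rows the query returns one row per database for sequence 68463, so downstream counts see the sequence once per database regardless of how many duplicate entries exist.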

A failing test case is in commit b01670be74305dcf8e236aae58015a09609e393c, on branch hot-fixes.

erikrikarddaniel commented 10 years ago

Fixed in 40c02ff342b02b202e43b3406e1b81056753c3c7.