tripal / tripal

The Tripal package is a suite of Drupal modules for creating biological (genomic, genetic, breeding) websites. Visit the Tripal homepage at http://tripal.info for documentation, support, and other information. The Drupal project page is at http://drupal.org/project/tripal.
GNU General Public License v2.0
66 stars 49 forks source link

No sequence from alignment in Tripal 3.1 templates? #891

Closed mdondrup closed 3 years ago

mdondrup commented 5 years ago

BUG/ERROR report

System information

Issue description

After updating Tripal 2.1 -> 3.1 it seems that the new templates do not show sequence derived from alignments, while legacy templates do display sequence.

Example:

https://blowfly-test.cbu.uib.no/feature/Tigriopus/kingsejongensis/gene/maker-scaffold5_size1054832-snap-gene-3.9

Steps to reproduce

Error messages and screenshots

No sequence is available.

image

Same gene in prod: image

bradfordcondon commented 5 years ago

I could have sworn we have an open an issue for this but I can't find it.

https://github.com/tripal/tripal/blob/dd88e9fa9b497fe046913e97110b4f4721431edc/tripal_chado/includes/TripalFields/data__sequence/data__sequence.inc#L81-L176

You can see the field has the code to do this, but it's all commented out. Im guessing for performative reason? If you enable this code do you have what you want?

I'd like this addressed as well.

mdondrup commented 5 years ago

Ok, thank you @bradfordcondon. I have edited the file as suggested and cleared caches. Now I am getting an Ajax spinner at least, not sure if that is related to #892?

A warning associated with the Ajax call appears in the log:

TYPE php DATE Friday, March 15, 2019 - 10:19 USER Administrator LOCATION https://blowfly-test.cbu.uib.no/bio_data/ajax/field_attach/tripal-entity-2622--data__sequence_coordinates REFERRER https://blowfly-test.cbu.uib.no/bio_data/2622 MESSAGE Warning: Invalid argument supplied for foreach() in file_entity_set_title_alt_properties_on_file_fields() (line 248 of /home/licebase/d7/sites/all/modules/file_entity/file_entity.file.inc). SEVERITY warning

image

bradfordcondon commented 5 years ago

definitely not related to #892 . Warning also looks unrelated.

I think that theres a reason the code is commented out :P

mdondrup commented 5 years ago

Guess there is. Looking at the code it used direct SQL queries, wouldn't it make sense to separate front-end and back-end logic by using Tripal API calls for getting the seqs? I remember there was an API call to retrieve all sequences per feature reliably.

laceysanderson commented 5 years ago

I think the original issue was #108? However, I think there is more info in this issue now... perhaps we should close the other as a duplicate?

Also, I believe the API function is chado_get_feature_sequences()

mdondrup commented 5 years ago

I have experimented a bit and came up with a quick fix that works for testing. But there are some issues left:

 /**
   * @see TripalField::load()
   */
  public function load($entity) {
    $field_name = $this->field['field_name'];
    $feature = $entity->chado_record;
    $feature = chado_expand_var($feature, 'field', 'feature.residues');
    if (empty ($feature->residues)) {
        $seqs = chado_get_feature_sequences(array('feature_id' => $feature->feature_id), array('derive_from_parent' => 1, 'aggregate' => 0, 'is_html' => 1, 'width' => 50));
        if (!empty($seqs[0]['residues'])) {
          $entity->{$field_name}['und'][0]['value'] =  $seqs[0]['residues'];
        } else {
        //  $entity->{$field_name}['und'][0]['value'] = "No sequence from alignment";
        }
    } else {
        $entity->{$field_name}['und'][0]['value'] =  $feature->residues ;
    }
    // This is to demonstrate the problem with multiple sequences
    // $entity->{$field_name}['und'][1]['value'] = "This will never be displayed, but prevents the default value to appear if no sequence is found";
laceysanderson commented 5 years ago

The entity interface doesn't support displaying multiple sequences. In the node interface, additional items could simply be added to the list and displayed adding HTML output. The commented out code tried to add more sequences to the output by assigning like this: $entity->{$field_name}['und'][$num_seqs++]['value'] = ..., but this doesn't work, possibly the entity's field needs to be defined as multivalue to allow other values than $entity->{$field_name}['und'][0]['value']

We often supply multiple values to fields in the exact manner you're describing (example). Looks like it's not working for this field due to the corresponding formatter which should loop through each value but doesn't... That should get the multiple sequence support you're looking for.

I think you can get the fasta header by using chado_get_fasta_defline.

Unfortunately I don't have any data for testing/developing this since I deal with breeding data... Do you have a test fasta/gff3 I could use?

mdondrup commented 5 years ago

Hi, here is a small example from the Ls Rhabdovirus genome. There is only a single exon per gene for now. Hope this works out.

lsrhab9.gb_.gff.sorted.fasta.zip lsrhab9.gb_.gff.sorted.zip

laceysanderson commented 5 years ago

Hi @mdondrup,

You can check out 891-tv3-derived_sequence for a simplified sequence field which works for both

It will support multiple sequences and uses the chado_get_feature_sequences() API function. It also shows the fasta record with the fasta defline. It also removes the height restriction on the sequence field.

Protein residues and coding sequence are handled by separate fields (data__protein_sequence and so_cds respectively)

Do these fields now meet your needs?

spficklin commented 5 years ago

@mdondrup and @laceysanderson I'm just following up on this.
@mdondrup can you confirm if branch @laceysanderson mention's fixes the issue for you. @laceysanderson do you want to issue a PR for that fix?

webfaqtory commented 5 years ago

The sequence display works on the demo site. See http://demo-3x.tripal.info/bio_data/609

Tried the 3.2-dev branch and we still get "There is no sequence." for a mRNA that does have a sequence display in V2

spficklin commented 5 years ago

@webfaqtory. Can you tell me how you loaded your mRNA sequences? Did you load them via a FASTA file or were you hoping they would show up due to an alignment to a whole genome sequence? If the latter, how did you load your mRNA... via GFF file?

toefish commented 5 years ago

@spficklin. The mRNA sequences were originally loaded via FASTA files, with a follow-up loading of a GFF file that delineates the different features of the mRNA. The site @webfaqtory is working on is an upgrade from Tripal 2.x to Tripal 3.2. The 2.x site displayed the sequences fine. It does seem odd that sequence display seems to work on the demo 3.x site, but not on our 3.2 site.

spficklin commented 5 years ago

@toefish sorry for the slow reply. I was on vacation. Could you share snippets of the FASTA file and GFF file for those sequences?

laceysanderson commented 4 years ago

@toefish, @webfaqtory I'm just following up on this, can you update us on your situation and perhaps provide file snippets for testing purposes?

laceysanderson commented 3 years ago

Closing due to no activity. Feel free to comment back here if you still need help and I'll re-open