tripal / tripal_blast

Provides a user interface to BLAST on Tripal sites.
https://tripal.github.io/tripal_blast/

[Question] Is the blastn blast in working state? #113

Open shreyas-a-s opened 1 month ago

shreyas-a-s commented 1 month ago

Hi there!

I have tried multiple ways including running the docker version of blast and every time I am not able to complete the blast run. It probably is something I am doing differently, but I am unable to find out the reason myself.

Obtained Result:

  1. Error after clicking BLAST: image
  2. Error shown on home page after the run: image
  3. Error while running drush command: image

Steps I followed:

  1. Run the docker version using the commands provided in README.
  2. Run bash inside the docker:
    docker exec -it <CONTAINER ID> bash
  3. Download nelumbo nucifera fasta file from ncbi using curl:
    curl 'https://api.ncbi.nlm.nih.gov/datasets/v2alpha/genome/accession/GCF_000365185.1/download?include_annotation_type=GENOME_FASTA&include_annotation_type=GENOME_GFF&include_annotation_type=RNA_FASTA&include_annotation_type=CDS_FASTA&include_annotation_type=PROT_FASTA&include_annotation_type=SEQUENCE_REPORT&hydrated=FULLY_HYDRATED' -o nelumbo-nucifera.zip
    unzip nelumbo-nucifera.zip
    mv ncbi_dataset/data/GCF_000365185.1/GCF_000365185.1_Chinese_Lotus_1.1_genomic.fna ./nelumbo-nucifera.fna
  4. Prepare it using the command:
    makeblastdb -dbtype nucl -parse_seqids -hash_index -in nelumbo-nucifera.fna -input_type fasta -title "Nelumbo Nucifera DB Test" -out nelumbo_nucifera
  5. Go to the tripal_blast code and make the form elements optional, just like this.
  6. Put the details in blast database form in drupal website at localhost:80
  7. Run blastn with query sequence:
    >partial lipoxygenase Glyma15g03040
    TTTCGTATGA GATTAAAATG TGTGAAATTT TGTTTGATAG GACATGGGAA
    AGGAAAAGTT GGAAAGGCTA CAAATTTAAG AGGACAAGTG TCGTTACCAA
    CCTTGGGAGC TGGCGAAGAT GCATACGATG TTCATTTTGA ATGGGACAGT
    GACTTCGGAA TTCCCGGTGC ATTTTACATT AAGAACTTCA TGCAAGTTGA
    GTTCTATCTC AAGTCTCTAA CTCTCGAAGA CATTCCAAAC CACGGAACCA
    TTCACTTCGT ATGCAACTCC TGGGTTTACA ACTCAAAATC CTACCATTCT
    GATCGCATTT TCTTTGCCAA CAATGTAAGC TACTTAAATA CTGTTATACA
    TTGTCTAACA TCTTGTTAGA GTCTTGCATG ATGTGTACCG TTTATTGTTG
    TTGTTGAACT TTACCACATG GCATGGATGC AAAAGTTGTT ATACACATAA
    ATTATAATGC AGACATATCT TCCAAGCGAG ACACCGGCTC CACTTGTCAA
    GTACAGAGAA GAAGAATTGA AGAATGTAAG AGGGGATGGA ACTGGTGAGC
    GCAAGGAATG GGATAGGATC TATGATTATG ATGTCTACAA TGACTTGGGC
    GATCCAGATA AGGGTGAAAA GTATGCACGC CCCGTTCTTG GAGGTTCTGC
    CTTACCTTAC CCTCGCAGAG GAAGAACCGG AAGAGGAAAA ACTAGAAAAG
    GTTTCTCACT AGTCACTAAT TTATTACTTT TTAATGTTTG TTTTTAGGCA
    TCTTTTCTGA TGAAATGTAT ACTTTTGATG TTTTTTTGTT TTAGCATAAC
    TGAATTAGTA AAGTGTGTTG TGTTCCTTAG AAGTTAGAAA AGTACTAAGT
    ATAAGGTCTT TGAGTTGTCG TCTTTATCTT AACAGATCCC AACAGTGAGA
    AGCCCAGTGA TTTTGTTTAC CTTCCGAGAG ATGAAGCATT TGGTCACTTG
    AAGTCATCAG ATTTTCTCGT TTATGGAATC AAATCAGTGG CTCAAGACGT
    CTTGCCCGTG TTGACTGATG CGTTTGATGG CAATCTTTTG AGCCTTGAGT
    TTGATAACTT TGCTGAAGTG CGCAAACTCT ATGAAGGTGG AGTTACACTA
    CCTACAAACT TTCTTAGCAA GATCGCCCCT ATACCAGTGG TCAAGGAAAT
    TTTTCGAACT GATGGCGAAC AGTTCCTCAA GTATCCACCA CCTAAAGTGA
    TGCAGGGTAT GCTACATATT TTGAATATGT AGAATATTAT CAATATACTC
    CTGTTTTTAT TCAACATATT TAATCACATG GATGAATTTT TGAACTGTTA

laceysanderson commented 1 month ago

Thank you for this and for the detailed steps to reproduce! Sorry for the delay in getting back to you; it was a long weekend here in Canada. I'll have to look a bit deeper into the module to see how far along the functionality for actually running a blast job is.

I'll get back to you likely tomorrow but feel free to follow up often as there is a lot on my plate so sometimes things get dropped. The state of this module is in big part because of all the work I've been doing on Tripal core and there are only so many hours in the day unfortunately! 😅

shreyas-a-s commented 1 month ago

I totally get it, @laceysanderson. We all have only so much time per day left to work on projects, right :) It's totally fine.

Look into it when you get some extra time. In the meantime, I will keep testing and add any further information I gather to this discussion.

laceysanderson commented 1 month ago

Following up here: BLAST is definitely not in a working state just yet.

Reproducible

Following a similar procedure to @shreyas-a-s above... specifically:

  1. Using a docker container for this module, I went to admin/tripal/extension/tripal_blast and clicked on "Add Tripal BLAST Database". I then filled out the form using a test blast database included in this module as follows: image
     NOTE: The full path in the docker to the test database is: /var/www/drupal/web/modules/contrib/tripal_blast/tests/fixtures/Chlamydomonas_reinhardtii_v5.6/Chlamydomonas_reinhardtii_v5.6
  2. Then I went to blastn through the UI and submitted a job using this target database and the query used by @shreyas-a-s above. image
  3. When I submitted the job I got the same error message. image

I would have expected to see a page telling me that the job was submitted and I was in a queue. With that expectation in mind I went to admin/tripal/tripal_jobs to check to see if the job was submitted. ✅ It was! But I was then shown all these errors/warnings: image

I then ran the blast job on the command-line and observed: image

Next Steps

TWIG updates needed

The error shown in step 3 indicates that our twig files are not using the correct syntax for comparison. I would guess this is because they were developed against a much older version of twig. To fix, we'll want to update the files in the tripal_blast/templates folder to use the correct comparison syntax. According to the twig error message, line 19 of templates/template-tripal-blast-report-pending.html.twig, which is currently {% if job.status_code === 0 %}, should use the twig "is same as(value)" test instead, i.e. {% if job.status_code is same as(0) %}.

Blastn service issues with advanced parameters

This is implied by these errors on the page loaded after the job is submitted.

Warning: Undefined array key "gapCost" in Drupal\tripal_blast\Services\TripalBlastProgramBlastn->formFieldBlastKey() (line 164 of modules/contrib/tripal_blast/src/Services/TripalBlastProgramBlastn.php).

This is likely caused by a disagreement between the name of the form element and the name the service expects the value to have. These need to match, so the form should be updated to use the name the service expects.

Deprecated function: explode(): Passing null to parameter #2 ($string) of type string is deprecated in Drupal\tripal_blast\Services\TripalBlastProgramHelper::programSetGap() (line 203 of modules/contrib/tripal_blast/src/Services/TripalBlastProgramHelper.php).

This implies we need a check in programSetGap() so that explode() is only called when we were able to retrieve the value. This is directly linked to the first warning.

Warning: Undefined array key 1 in Drupal\tripal_blast\Services\TripalBlastProgramHelper::programSetGap() (line 204 of modules/contrib/tripal_blast/src/Services/TripalBlastProgramHelper.php).

This is also directly linked to the first warning. We tried to access index 1 (the second value) of our exploded array without checking to see if it was there.

Look in the files mentioned to fix these issues.
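
As a rough illustration of the three guards described above (tolerate a missing key, only explode a real string, and check both pieces exist), here is a minimal standalone sketch. It is not the module's actual code: the function name parse_gap_cost(), the comma delimiter, and the returned keys gapOpen and gapExtend are all assumptions made for the example; only the "gapCost" key comes from the warning above.

```php
<?php

/**
 * Hypothetical helper (not part of tripal_blast): turn a submitted gap cost
 * value into two integers, or NULL when the value is missing or malformed.
 *
 * Assumption: the value looks like "5,2" (two numbers, comma separated).
 */
function parse_gap_cost(array $values, string $key = 'gapCost', string $delimiter = ','): ?array {
  // Null coalescing avoids the "Undefined array key" warning.
  $raw = $values[$key] ?? NULL;

  // Only call explode() when there is a non-empty value to split.
  if ($raw === NULL || $raw === '') {
    return NULL;
  }
  $parts = explode($delimiter, (string) $raw);

  // Make sure index 1 exists before reading it.
  if (count($parts) < 2) {
    return NULL;
  }

  return [
    'gapOpen' => (int) trim($parts[0]),
    'gapExtend' => (int) trim($parts[1]),
  ];
}
```

A caller could then skip the gap options entirely when NULL comes back, rather than passing partial values on to the blast command.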

Blast report controller not finding report variable

The controller makes an assumption that doesn't hold true: that a report variable is always available, when it is actually only set in one case. I would fix this by defining the $report variable before the if block and initializing it to NULL.
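
The suggested pattern might look something like the sketch below. This is a standalone stand-in, not the real controller: the function name build_report_variables(), the 'completed' status string, and the array keys are hypothetical and only serve to show $report being initialised before the conditional.

```php
<?php

/**
 * Hypothetical stand-in for the controller logic (names are assumptions).
 */
function build_report_variables(array $job): array {
  // Define $report up front so it exists whichever branch runs.
  $report = NULL;

  // Only the "completed" case populates the report in this sketch.
  if (($job['status'] ?? NULL) === 'completed') {
    $report = ['output_file' => $job['output_file'] ?? NULL];
  }

  // $report is always defined here, so later code never sees an undefined variable.
  return ['job' => $job, 'report' => $report];
}
```

With this shape, a template or later code can test whether report is NULL instead of relying on the variable existing.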

Blast Tripal Job not receiving all the parameters it needs

The blast job expects to have the output filename passed to it, but the code that submits the job does not provide it. In Tripal 3 this was calculated based on configuration and the value of the job. This needs to be fixed in the code that submits the Tripal job.