af-lab / histone-catalogue

Core histone catalogue --- Live manuscript

build failure with bioperl 1.007001 #36

Closed: aflaus closed this issue 7 years ago

aflaus commented 7 years ago
scons: Reading SConscript files ...
Checking for bp_genbank_ref_extractor...(cached) /usr/local/bin/bp_genbank_ref_extractor
Checking for weblogo...(cached) /usr/local/bin/weblogo
Checking if weblogo supports --number-interval...yes
Checking for perl module Bio::AlignIO...(cached) yes
Checking for perl module Bio::Align::Utilities...(cached) yes
Checking for perl module Bio::CodonUsage::Table...(cached) yes
Checking for perl module Bio::DB::EUtilities...(cached) yes
Checking for perl module Bio::LocatableSeq...(cached) yes
Checking for perl module Bio::Root::Version...(cached) yes
Checking for perl module Bio::Seq...(cached) yes
Checking for perl module Bio::SeqIO...(cached) yes
Checking for perl module Bio::SeqUtils...(cached) yes
Checking for perl module Bio::SimpleAlign...(cached) yes
Checking for perl module Bio::Tools::CodonTable...(cached) yes
Checking for perl module Bio::Tools::EUtilities...(cached) yes
Checking for perl module Bio::Tools::Run::Alignment::Clustalw...(cached) yes
Checking for perl module Bio::Tools::Run::Alignment::TCoffee...(cached) yes
Checking for perl module Bio::Tools::Run::Phylo::PAML::Codeml...(cached) yes
Checking for perl module Bio::Tools::SeqStats...(cached) yes
Checking for perl module File::Which...(cached) yes
Checking for perl module Moose...(cached) yes
Checking for perl module Moose::Util::TypeConstraints...(cached) yes
Checking for perl module MooseX::StrictConstructor...(cached) yes
Checking for perl module namespace::autoclean...(cached) yes
Checking for perl module Statistics::Basic...(cached) yes
Checking for perl module Test::Exception...(cached) yes
Checking for perl module Test::Output...(cached) yes
Checking for perl module Text::CSV...(cached) yes
Checking for LaTeX package fontenc...(cached) yes
Checking for LaTeX package inputenc...(cached) yes
Checking for LaTeX package graphicx...(cached) yes
Checking for LaTeX package url...(cached) yes
Checking for LaTeX package todonotes...(cached) yes
Checking for LaTeX package natbib...(cached) yes
Checking for LaTeX package color...(cached) yes
Checking for LaTeX package kpfonts...(cached) yes
Checking for LaTeX package seqsplit...(cached) yes
Checking for LaTeX package eqparbox...(cached) yes
Checking for LaTeX package capt-of...(cached) yes
Checking for LaTeX package hyperref...(cached) yes
Checking for LaTeX package fp...(cached) yes
Checking for LaTeX package afterpage...(cached) yes
Checking for LaTeX package isodate...(cached) yes
Checking for LaTeX package etoolbox...(cached) yes
Checking for LaTeX package stringstrings...(cached) yes
Checking for LaTeX package intcalc...(cached) yes
Checking for LaTeX package siunitx...(cached) yes
Checking for LaTeX document class memoir...(cached) yes
Checking for BibTeX style agu...(cached) yes
Checking e-mail address...(cached) andrew.flaus@nuigalway.ie
scons: done reading SConscript files.
scons: Building targets ...
rm -r results/sequences
$ bp_genbank_ref_extractor '--assembly' 'Reference GRC' '--genes' 'uid' '--pseudo' '--non-coding' '--upstream' '500' '--downstream' '500' '--transcripts' 'accession' '--proteins' 'accession' '--limit' '300' '--format' 'genbank' '--save' 'results/sequences' '--save-data' 'csv' '--email' 'andrew.flaus@nuigalway.ie' '"Homo sapiens"[organism] AND (H1*[gene name] OR H2A*[gene name] OR H2B*[gene name] OR H3*[gene name] OR H4*[gene name] OR HIST1*[gene name] OR HIST2*[gene name] OR HIST3*[gene name] OR HIST4*[gene name] OR CENPA[gene name])'
Unescaped left brace in regex is deprecated, passed through in regex; marked by <-- HERE in m/\G{ <-- HERE / at /usr/share/perl5/Bio/ASN1/EntrezGene.pm line 148.
This is bp_genbank_ref_extractor on Bioperl 1.007001 on [2016-12-23 22:12:12]
Searching on Entrez gene...
Fetching gene info...
WARNING: gene with UID='8023' is of type 'unknown' . Skipping...
WARNING: gene with UID='283120' is of type 'ncRNA' . Skipping...
WARNING: gene with UID='105259599' is of type 'other' . Skipping...
WARNING: gene with UID='339942' is of type 'ncRNA' . Skipping...
WARNING: gene with UID='84848' is of type 'ncRNA' . Skipping...
WARNING: gene with UID='85495' is of type 'ncRNA' . Skipping...
Fetching gene sequences...
Fetching transcript sequences...
Fetching protein sequences...
$ perl '-Ilib-perl5' '-Iscripts' '-MHistoneSequencesDB' '-e' 'HistoneSequencesDB->new("results/sequences")->write_db("results/histones_db.store")'
$ perl '-Ilib-perl5' '-Iscripts' 'scripts/align_proteins.pl' 'results/histones_db.store' 'H2A' 'results/aligned_H2A_proteins.fasta'
$ perl '-Ilib-perl5' '-Iscripts' 'scripts/align_transcripts.pl' 'results/histones_db.store' 'results/aligned_H2A_proteins.fasta' 'results/aligned_H2A_cds.fasta'
Can't call method "display_id" on an undefined value at scripts/align_transcripts.pl line 80.
scons: *** [results/aligned_H2A_cds.fasta] Error 25
scons: building terminated because of errors.

aflaus commented 7 years ago

Is there a problem with the link between proteins and transcripts due to a RefSeq ID problem? Or is there some unflagged dependency?

If it's a RefSeq ID issue, then maybe the code could check for it and report it, because a problem like that is likely to arise again.
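
A check of the kind suggested here might look like the sketch below. It is only an illustration: the lookup table, the accession names and the function transcript_id_for are all made up, since the real data structures in scripts/align_transcripts.pl are not shown in this thread. The point is to fail with a message that names the protein whose transcript is missing, rather than crashing later on an undefined value.

```perl
use strict;
use warnings;

# HYPOTHETICAL stand-in for the protein-to-transcript lookup. The names
# below are invented placeholders, not real RefSeq accessions.
my %transcript_for = (
    'PROT_A' => { display_id => 'TRANSCRIPT_A' },
    'PROT_B' => undef,    # simulates a lookup that found nothing
);

# Return the transcript ID for a protein, or die with a message that
# names the protein instead of failing on an undefined value.
sub transcript_id_for {
    my ($protein_acc) = @_;
    my $transcript = $transcript_for{$protein_acc};
    die "No transcript found for protein '$protein_acc'\n"
        unless defined $transcript;
    return $transcript->{display_id};
}

for my $acc (sort keys %transcript_for) {
    # eval traps the die so every protein gets checked and reported.
    my $id = eval { transcript_id_for($acc) };
    if (defined $id) {
        print "$acc -> $id\n";
    }
    else {
        print "problem: $@";
    }
}
```

With a guard like this, the build would still stop, but the log would say which record broke the protein-to-transcript link.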

carandraug commented 7 years ago

My guess is that this is an issue with your version of perl or bioperl, most likely the latter. When I'm back at Oxford I will set up a VM to try to replicate it.

It shouldn't be a problem with RefSeq, because a fresh build works here. Any dependency not checked by scons would fail to even execute. The error message, Can't call method "display_id" on an undefined value at..., means that at some point the script expects a sequence object but gets an undefined value instead, so I'm guessing something changed in the latest bioperl release.
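
For anyone unfamiliar with that message, here is a minimal reproduction in plain Perl. It does not use bioperl at all; the variable $seq is just an undefined scalar standing in for the sequence object the script expected.

```perl
use strict;
use warnings;

# An undefined scalar standing in for the missing sequence object.
my $seq;

# Calling a method on undef is fatal; eval traps the error so the
# message can be inspected instead of ending the program.
my $ok    = eval { $seq->display_id; 1 };
my $error = $ok ? '' : $@;

print $error;
```

The printed message starts with: Can't call method "display_id" on an undefined value, followed by the file name and line number.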

carandraug commented 7 years ago

This issue was fixed with a4dff55. The problem is that the latest bioperl has fixed a bug that we were working around (see bioperl/bioperl-live#137). I made the workaround conditional so it still works for people who install bioperl using their distribution package manager.
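
A version-conditional workaround can be sketched roughly as follows. This is not the code from a4dff55, which is not shown in this thread. The function needs_workaround is hypothetical, and the cutoff is an assumption taken from the build log above, where the failure appeared on Bioperl 1.007001. In a real script the version would presumably come from Bio::Root::Version, which the build already checks for. Here it is passed in as a plain number so the sketch runs without bioperl installed.

```perl
use strict;
use warnings;

# ASSUMPTION: the first release that no longer needs the workaround.
# Taken from the failing build log; the cutoff used in the actual fix
# is not shown in this issue.
my $FIRST_FIXED_VERSION = 1.007001;

# HYPOTHETICAL helper: true when the given bioperl version still has
# the upstream bug and therefore still needs the workaround.
sub needs_workaround {
    my ($bioperl_version) = @_;
    return $bioperl_version < $FIRST_FIXED_VERSION ? 1 : 0;
}

# One older release (such as a distribution package might ship) and
# the release from the build log.
for my $version (1.006924, 1.007001) {
    printf "bioperl %.6f: workaround %s\n",
        $version, needs_workaround($version) ? 'applied' : 'skipped';
}
```

Keeping the decision in one small function means older packaged releases keep the workaround while newer releases skip it.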