schatzlab / scikit-ribo

Accurate estimation and robust modelling of translation dynamics at codon resolution
GNU General Public License v2.0

Bug creating file name for gtfDedup file in gtf_preprocess.py #5

Open catsargent opened 6 years ago

catsargent commented 6 years ago

Hi,

I installed scikit-ribo using pip and am using it with human data. To generate the RNA fold file, I first ran gtf_preprocess.py on our GTF file and got the following error:

[status] Reading the input file: XXX/single_CDS_translatedGenes.sorted.gtf
[execute] Starting the pre-processing module
[execute] Loading the the gtf file in to sql db
sys:1: DtypeWarning: Columns (0) have mixed types. Specify dtype option on import or set low_memory=False.
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/scikit-ribo/gtf_preprocess.py", line 266, in <module>
    worker.convertGtf()
  File "/usr/local/lib/python3.5/dist-packages/scikit-ribo/gtf_preprocess.py", line 44, in convertGtf
    gtfDedup = self.output + "/" + self.prefix + '.dedup.gtf'
TypeError: unsupported operand type(s) for +: 'NoneType' and 'str'
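The message says the left operand of `+` is `None`, which suggests that `self.output` was never set. The failure can be reproduced in isolation with a minimal sketch. Note that `build_dedup_path` is a hypothetical helper, not part of scikit-ribo; it only mirrors the concatenation on line 44 of the traceback.

```python
def build_dedup_path(output, prefix):
    # Hypothetical helper: mirrors the expression from line 44 of the traceback.
    return output + "/" + prefix + ".dedup.gtf"


try:
    # Passing None as the output directory imitates an unset self.output.
    build_dedup_path(None, "sample")
except TypeError as err:
    message = str(err)
    print(message)
```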

I realised that the gtf_preprocess.py file on GitHub differs from the version I installed with pip.

I managed to fix the problem by changing:
def __init__(self, gtf=None, fasta=None, prefix=None, output=None):

to:
def __init__(self, gtf=None, fasta=None, output=None, prefix=None):

and adding the following lines (which are in the version on GitHub but not in the version I installed using pip):
self.base = os.path.basename(self.gtf)
self.prefix = os.path.splitext(self.base)[0]
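For reference, here is a minimal sketch of how the reordered signature and the two added lines fit together. It is an illustration under assumptions, not the actual scikit-ribo class: `GtfPreProcessSketch`, `dedup_path`, and the `genome.fa` and `out` arguments are hypothetical, and I am assuming the caller passes the output directory as the third positional argument.

```python
import os


class GtfPreProcessSketch:
    """Hypothetical stand-in for the real class, for illustration only."""

    def __init__(self, gtf=None, fasta=None, output=None, prefix=None):
        self.gtf = gtf
        self.fasta = fasta
        self.output = output
        # Derive the prefix from the GTF file name instead of relying on
        # the caller to pass one.
        self.base = os.path.basename(self.gtf)
        self.prefix = os.path.splitext(self.base)[0]

    def dedup_path(self):
        # Same concatenation as line 44 of the traceback.
        return self.output + "/" + self.prefix + ".dedup.gtf"


# Third positional argument now lands in `output`, so it is no longer None.
worker = GtfPreProcessSketch(
    "XXX/single_CDS_translatedGenes.sorted.gtf", "genome.fa", "out"
)
dedup = worker.dedup_path()
print(dedup)  # out/single_CDS_translatedGenes.sorted.dedup.gtf
```

With the old argument order, the same call would have put "out" into `prefix` and left `output` as `None`, which matches the TypeError shown above.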

Regards, Catherine