mills-lab / spectre

Spectral coherence classification of actively translated regions in ribosome profiling sequence data.
BSD 3-Clause "New" or "Revised" License

A sample of test file doesn't work with Spectre #4

Closed: gearhq closed this issue 8 years ago

gearhq commented 8 years ago

Hello,

I'm really interested in using this program for my analysis, but the same errors reported here are occurring with my custom GTF file of putative long non-coding RNAs.

SPECtre works fine with the complete test data, but it fails when I run it on a subset of the 'Homo_sapiens.GRCh38.78.test.gtf' test file. When I select only the 'snoRNA' and 'lincRNA' entries to test the analysis with non-coding RNAs, SPECtre crashes with the message:

Traceback (most recent call last):
  File "SPECtre.py", line 1323, in <module>
    transcript_metrics, reference_read_distribution = calculate_transcript_scores(transcript_gtf, transcript_fpkms, float(args.min), asite_buffers, psite_buffers, args.input, int(args.len), int(args.step), args.type, analyses, int(args.nt))
  File "SPECtre.py", line 922, in calculate_transcript_scores
    reference_distribution = calculate_reference_distribution(protein_coding_distributions)
  File "SPECtre.py", line 804, in calculate_reference_distribution
    reference_distribution[read_length] /= reference_transcripts
ZeroDivisionError: integer division or modulo by zero

When I try with only one entry, it crashes with the following message:

Traceback (most recent call last):
  File "SPECtre.py", line 1323, in <module>
    transcript_metrics, reference_read_distribution = calculate_transcript_scores(transcript_gtf, transcript_fpkms, float(args.min), asite_buffers, psite_buffers, args.input, int(args.len), int(args.step), args.type, analyses, int(args.nt))
  File "SPECtre.py", line 871, in calculate_transcript_scores
    transcripts, coordinates = zip(*flatten(gtf).iteritems())
ValueError: need more than 0 values to unpack

SPECtre is supposed to work with only a sample of the entries, right? Or does it require a minimum number of entries?

Best regards.

stonyc commented 8 years ago

SPECtre requires protein-coding transcripts in order to build the distributions it uses to distinguish translated from non-translated transcripts. Removing the protein-coding transcripts from the sample GTF therefore results in an error, because the translational distributions (used to score all other types of transcripts) cannot be built. Non-coding RNAs, pseudogenes, lncRNAs, etc. will be scored as part of the default protocol.
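
For anyone building a reduced GTF for testing, the point above suggests keeping the protein-coding entries alongside the non-coding ones of interest. The following is only a rough sketch of how such a subset might be made, not part of SPECtre itself. The helper names `biotype_of` and `subset_gtf` are hypothetical, and the code assumes the biotype is stored in a `gene_biotype "..."` attribute in the ninth column, as in Ensembl-style GTF files; if your file uses a different attribute name, pass it as `key`.

```python
import re


def biotype_of(gtf_line, key="gene_biotype"):
    """Return the biotype recorded on one GTF line, or None if absent.

    Assumption: the biotype is an attribute of the form `key "value";`
    in the ninth tab-separated column.
    """
    fields = gtf_line.rstrip("\n").split("\t")
    if gtf_line.startswith("#") or len(fields) < 9:
        return None  # header/comment line or malformed record
    match = re.search(r'\b%s "([^"]+)"' % re.escape(key), fields[8])
    return match.group(1) if match else None


def subset_gtf(lines, wanted, reference="protein_coding", key="gene_biotype"):
    """Keep lines whose biotype is in `wanted`, plus every `reference` line.

    Lines without a recognizable biotype (including `#` headers) are dropped.
    Raises ValueError if no reference entries survive, since a subset
    without protein-coding transcripts is the case described in this thread.
    """
    keep = set(wanted) | {reference}
    kept = [line for line in lines if biotype_of(line, key) in keep]
    if not any(biotype_of(line, key) == reference for line in kept):
        raise ValueError(
            "no %s entries found; the subset needs them as a reference" % reference
        )
    return kept
```

A typical use would be to read the full test GTF with `open(...)`, call `subset_gtf(handle, ["snoRNA", "lincRNA"])`, and write the returned lines to a new file. The check at the end is only a convenience to fail early with a readable message before running the analysis.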