Closed aaiezza closed 8 years ago
Hi thanks for reporting this!
The error you are mentioning
"ERROR: Cannot set a frame with no defined index and a value that cannot be converted to a Series" is already fixed in the latest version, 0.9.7.
The warning about convert_objects has no effect on the output.
Could you please try to upgrade to the latest version 0.9.7 and let me know if you still get the fatal error?
Thanks!
Oh dear. Forgive me for posting without noticing that first.
CRISPRessoPooled \
--fastq_r1 A5_S184_L001_R1_001.fastq.gz \
--fastq_r2 A5_S184_L001_R2_001.fastq.gz \
--amplicons_file amplicons_description.txt \
--bowtie2_index /data/ref_genome/mouse/musculus \
--gene_annotations /data/ref_genome_annot/ucsc/mouse/vMM10.annotation.gz \
--n_processes 4 \
--name A5_S184 \
--trim_sequences \
--max_paired_end_reads_overlap 250 \
--exclude_bp_from_left 10 \
--exclude_bp_from_right 10 \
--output_folder cspresso/A5_S184 \
--save_also_png
Sorry it's so long. There may be other problems at play here.
INFO @ Wed, 13 Jul 2016 15:42:58:
Checking dependencies...
INFO @ Wed, 13 Jul 2016 15:42:58:
All the required dependencies are present!
INFO @ Wed, 13 Jul 2016 15:42:58:
Amplicon description file and bowtie2 reference genome index files provided. The analysis will be perfomed using the reads that are aligned ony to the amplicons provided and not to other genomic regions.
INFO @ Wed, 13 Jul 2016 15:42:58:
Creating Folder /gpfs/fs2/scratch/aaiezza/amplicon_exp/cspresso/A5_S184/CRISPRessoPooled_on_A5_S184
WARNING @ Wed, 13 Jul 2016 15:42:58:
Folder /gpfs/fs2/scratch/aaiezza/amplicon_exp/cspresso/A5_S184/CRISPRessoPooled_on_A5_S184 already exists.
INFO @ Wed, 13 Jul 2016 15:42:58:
Trimming sequences with Trimmomatic...
INFO @ Wed, 13 Jul 2016 15:43:00:
Done!
INFO @ Wed, 13 Jul 2016 15:43:00:
Merging paired sequences with Flash...
INFO @ Wed, 13 Jul 2016 15:43:01:
Done!
INFO @ Wed, 13 Jul 2016 15:43:01:
Loading gene coordinates from annotation file: /gpfs/fs2/scratch/aaiezza/data/ref_genome_annot/ucsc/mouse/vMM10.annotation.gz...
1 reads; of these:
1 (100.00%) were unpaired; of these:
1 (100.00%) aligned 0 times
0 (0.00%) aligned exactly 1 time
0 (0.00%) aligned >1 times
0.00% overall alignment rate
INFO @ Wed, 13 Jul 2016 15:43:03:
The amplicon [Fmn1] is not mappable to the reference genome provided!
1 reads; of these:
1 (100.00%) were unpaired; of these:
1 (100.00%) aligned 0 times
0 (0.00%) aligned exactly 1 time
0 (0.00%) aligned >1 times
0.00% overall alignment rate
INFO @ Wed, 13 Jul 2016 15:43:04:
The amplicon [Dntt] is not mappable to the reference genome provided!
1 reads; of these:
1 (100.00%) were unpaired; of these:
1 (100.00%) aligned 0 times
0 (0.00%) aligned exactly 1 time
0 (0.00%) aligned >1 times
0.00% overall alignment rate
INFO @ Wed, 13 Jul 2016 15:43:05:
The amplicon [Ankrd10] is not mappable to the reference genome provided!
1 reads; of these:
1 (100.00%) were unpaired; of these:
1 (100.00%) aligned 0 times
0 (0.00%) aligned exactly 1 time
0 (0.00%) aligned >1 times
0.00% overall alignment rate
INFO @ Wed, 13 Jul 2016 15:43:07:
The amplicon [Mt1] is not mappable to the reference genome provided!
1 reads; of these:
1 (100.00%) were unpaired; of these:
1 (100.00%) aligned 0 times
0 (0.00%) aligned exactly 1 time
0 (0.00%) aligned >1 times
0.00% overall alignment rate
INFO @ Wed, 13 Jul 2016 15:43:08:
The amplicon [Psmd13] is not mappable to the reference genome provided!
1 reads; of these:
1 (100.00%) were unpaired; of these:
1 (100.00%) aligned 0 times
0 (0.00%) aligned exactly 1 time
0 (0.00%) aligned >1 times
0.00% overall alignment rate
INFO @ Wed, 13 Jul 2016 15:43:10:
The amplicon [Asap1] is not mappable to the reference genome provided!
1 reads; of these:
1 (100.00%) were unpaired; of these:
1 (100.00%) aligned 0 times
0 (0.00%) aligned exactly 1 time
0 (0.00%) aligned >1 times
0.00% overall alignment rate
INFO @ Wed, 13 Jul 2016 15:43:11:
The amplicon [chr10_1] is not mappable to the reference genome provided!
1 reads; of these:
1 (100.00%) were unpaired; of these:
1 (100.00%) aligned 0 times
0 (0.00%) aligned exactly 1 time
0 (0.00%) aligned >1 times
0.00% overall alignment rate
INFO @ Wed, 13 Jul 2016 15:43:12:
The amplicon [chr14] is not mappable to the reference genome provided!
1 reads; of these:
1 (100.00%) were unpaired; of these:
1 (100.00%) aligned 0 times
0 (0.00%) aligned exactly 1 time
0 (0.00%) aligned >1 times
0.00% overall alignment rate
INFO @ Wed, 13 Jul 2016 15:43:14:
The amplicon [chr13] is not mappable to the reference genome provided!
1 reads; of these:
1 (100.00%) were unpaired; of these:
1 (100.00%) aligned 0 times
0 (0.00%) aligned exactly 1 time
0 (0.00%) aligned >1 times
0.00% overall alignment rate
INFO @ Wed, 13 Jul 2016 15:43:15:
The amplicon [chr10] is not mappable to the reference genome provided!
WARNING @ Wed, 13 Jul 2016 15:43:15:
The amplicon sequence Fmn1 provided:
GTCTTTGAGTTTGGGCAGAATTTCTAAACTATATCCGTCTGCTTGCCCTCGCGTCCTGTTTCCTCCATTCATGTAGTTTCCAAAAGCCAGAATGAGAGCTAAGATATCCTTCACACTCTTCATGTGCAACAAGCCCTGCAGTGGCAAACACCAGCAGTTATGGTCTGAACTACAAACACAGTCATATTCGCTCGCTCCACAGCTAACCCTCATTAAAGTAACCAAATCCTGGTAAGTGGCTTGTGATTAGTTGTATCAACAGTTGGCAAATACAAGATACATTTCACTACAGCAGTATCATGTGGG
is different from the reference sequence(both strand):
GTCTTTGAGTTTGGGCAGAATTTCTAAACTATATCCGTCTGCTTGCCCTCGCGTCCTGTTTCCTCCATTCATGTAGTTTCCAAAAGCCAGAATGAGAGCTAAGATATCCTTCACACTCTTCATGTGCAACAAGCCCTGCAGTGGCAAACACCAGCAGTTATGGTCTGAACTACAAACACAGTCATATTCGCTCGCTCCACAGCTAACCCTCATTAAAGTAACCAAATCCTGGTAAGTGGCTTGTGATTAGTTGTATCAACAGTTGGCAAATACAAGATACATTTCACTACAGCAGTATCATGTGGG
CCCACATGATACTGCTGTAGTGAAATGTATCTTGTATTTGCCAACTGTTGATACAACTAATCACAAGCCACTTACCAGGATTTGGTTACTTTAATGAGGGTTAGCTGTGGAGCGAGCGAATATGACTGTGTTTGTAGTTCAGACCATAACTGCTGGTGTTTGCCACTGCAGGGCTTGTTGCACATGAAGAGTGTGAAGGATATCTTAGCTCTCATTCTGGCTTTTGGAAACTACATGAATGGAGGAAACAGGACGCGAGGGCAAGCAGACGGATATAGTTTAGAAATTCTGCCCAAACTCAAAGAC
WARNING @ Wed, 13 Jul 2016 15:43:15:
The amplicon sequence Dntt provided:
TAGAAACAACGCTCCTACTGTCCATTTATCCACTCAACAGATATTCACACATCACCTGCCGCCTGTTGGGCAGTGATTGACAAGGGTCCAAGCCAATCAGCTTCTTGTTCACATTGAGTTTCTGTTCTAGTACAGGGAGCCCAGAGCTCAGCCACCCGGGAGCTTTGCCCTGAGGAAGAAAGTCACCCCAAAAATTTATGTTAGAGACAGCAGTTTCAAACACCCAAGGGCTTTGGATAGCTCTAAACACCGTGTACCCGAAATAATCTGGACTAGACGGTAATTTGTTTTAATTCTCTTTGTAGCAGTTTGAGAGAGACTTGCGG
is different from the reference sequence(both strand):
TAGAAACAACGCTCCTACTGTCCATTTATCCACTCAACAGATATTCACACATCACCTGCCGCCTGTTGGGCAGTGATTGACAAGGGTCCAAGCCAATCAGCTTCTTGTTCACATTGAGTTTCTGTTCTAGTACAGGGAGCCCAGAGCTCAGCCACCCGGGAGCTTTGCCCTGAGGAAGAAAGTCACCCCAAAAATTTATGTTAGAGACAGCAGTTTCAAACACCCAAGGGCTTTGGATAGCTCTAAACACCGTGTACCCGAAATAATCTGGACTAGACGGTAATTTGTTTTAATTCTCTTTGTAGCAGTTTGAGAGAGACTTGCGG
CCGCAAGTCTCTCTCAAACTGCTACAAAGAGAATTAAAACAAATTACCGTCTAGTCCAGATTATTTCGGGTACACGGTGTTTAGAGCTATCCAAAGCCCTTGGGTGTTTGAAACTGCTGTCTCTAACATAAATTTTTGGGGTGACTTTCTTCCTCAGGGCAAAGCTCCCGGGTGGCTGAGCTCTGGGCTCCCTGTACTAGAACAGAAACTCAATGTGAACAAGAAGCTGATTGGCTTGGACCCTTGTCAATCACTGCCCAACAGGCGGCAGGTGATGTGTGAATATCTGTTGAGTGGATAAATGGACAGTAGGAGCGTTGTTTCTA
WARNING @ Wed, 13 Jul 2016 15:43:15:
The amplicon sequence Ankrd10 provided:
TGTGCACAGCGTGCTATTTCACACTAGGAATTGGCAAGAATCTCTAGGGAGTGCCAACACGTTTCCTGAGTCAGACAGTACTGGAAAACACCAGAAGGCCCACGTGGCCTTCTGACATCCAGGACGACCTGCCTGCTGGCATGAGAAGAGCGAAGAGCTTCTTTCTCCTGCCTGACCAGGAAGGGAACAATGCTGTCTCCATAAAGGAGAGGCTCTGGCT
is different from the reference sequence(both strand):
TGTGCACAGCGTGCTATTTCACACTAGGAATTGGCAAGAATCTCTAGGGAGTGCCAACACGTTTCCTGAGTCAGACAGTACTGGAAAACACCAGAAGGCCCACGTGGCCTTCTGACATCCAGGACGACCTGCCTGCTGGCATGAGAAGAGCGAAGAGCTTCTTTCTCCTGCCTGACCAGGAAGGGAACAATGCTGTCTCCATAAAGGAGAGGCTCTGGCT
AGCCAGAGCCTCTCCTTTATGGAGACAGCATTGTTCCCTTCCTGGTCAGGCAGGAGAAAGAAGCTCTTCGCTCTTCTCATGCCAGCAGGCAGGTCGTCCTGGATGTCAGAAGGCCACGTGGGCCTTCTGGTGTTTTCCAGTACTGTCTGACTCAGGAAACGTGTTGGCACTCCCTAGAGATTCTTGCCAATTCCTAGTGTGAAATAGCACGCTGTGCACA
WARNING @ Wed, 13 Jul 2016 15:43:15:
The amplicon sequence Mt1 provided:
GGACATTTCTCAGAGCCAGTTTTGTAGGAGTTCCCCGCCCCTAGCCTTAGCCGCCACCCAAGGTGTCCCAACTCACTCTTCTTGCAGGAGGTGCACTTGCAGTTCTTGCAGGCGCAGGAGCTGGTGCAAGTGCAGGAGCCGCCTGGGGAGGAGAAAGAAGACAGCATGAGGGAGGCAGCATTACAGCAGTGGCCAACACCACGAGTCCCGGCTCAGTTCACTAAGTCCTCCTCGGAGCTGCAGGGAGCCTAGCCCCACTTTTCTCCTCACAGGTTAAGTCAGGGATTATGTCTTTGAGTCCCAAGACATAAAGGTCCTTCACCTCTTTCT
is different from the reference sequence(both strand):
GGACATTTCTCAGAGCCAGTTTTGTAGGAGTTCCCCGCCCCTAGCCTTAGCCGCCACCCAAGGTGTCCCAACTCACTCTTCTTGCAGGAGGTGCACTTGCAGTTCTTGCAGGCGCAGGAGCTGGTGCAAGTGCAGGAGCCGCCTGGGGAGGAGAAAGAAGACAGCATGAGGGAGGCAGCATTACAGCAGTGGCCAACACCACGAGTCCCGGCTCAGTTCACTAAGTCCTCCTCGGAGCTGCAGGGAGCCTAGCCCCACTTTTCTCCTCACAGGTTAAGTCAGGGATTATGTCTTTGAGTCCCAAGACATAAAGGTCCTTCACCTCTTTCT
AGAAAGAGGTGAAGGACCTTTATGTCTTGGGACTCAAAGACATAATCCCTGACTTAACCTGTGAGGAGAAAAGTGGGGCTAGGCTCCCTGCAGCTCCGAGGAGGACTTAGTGAACTGAGCCGGGACTCGTGGTGTTGGCCACTGCTGTAATGCTGCCTCCCTCATGCTGTCTTCTTTCTCCTCCCCAGGCGGCTCCTGCACTTGCACCAGCTCCTGCGCCTGCAAGAACTGCAAGTGCACCTCCTGCAAGAAGAGTGAGTTGGGACACCTTGGGTGGCGGCTAAGGCTAGGGGCGGGGAACTCCTACAAAACTGGCTCTGAGAAATGTCC
WARNING @ Wed, 13 Jul 2016 15:43:15:
The amplicon sequence Psmd13 provided:
CATGATAGGGCATGGGATCAAGTCACAAAACCAGGAACACGTCTGCTGGAGCAGCAATTTCAGGATTAGGAGGCATCAGCAGCGCAGCCAGCCTGGAAGTGCAGGGCGCAGACCTCAGAGGGCTGCTTTGCTAGGCCTCCACAGCAGTAATCCCACCTTGTGTGCGACAGCGTCCGCTTCCTAAATGGCTTCTGTCCACATAAAACTAGGACAGCATTGGTAAACCCGCAAAGGGCAGGGGCCCAGCCCCCTTACCTGCTGCAAATCCAGCACTCGCGGCTGCACCCACGTCATGTGAACCCGCTTGTCCACCTCGTCTATGCTGCCTCTCACCAGCCCCACCGAGAGTGCCTTCATCACCAGCAACTCCA
is different from the reference sequence(both strand):
CATGATAGGGCATGGGATCAAGTCACAAAACCAGGAACACGTCTGCTGGAGCAGCAATTTCAGGATTAGGAGGCATCAGCAGCGCAGCCAGCCTGGAAGTGCAGGGCGCAGACCTCAGAGGGCTGCTTTGCTAGGCCTCCACAGCAGTAATCCCACCTTGTGTGCGACAGCGTCCGCTTCCTAAATGGCTTCTGTCCACATAAAACTAGGACAGCATTGGTAAACCCGCAAAGGGCAGGGGCCCAGCCCCCTTACCTGCTGCAAATCCAGCACTCGCGGCTGCACCCACGTCATGTGAACCCGCTTGTCCACCTCGTCTATGCTGCCTCTCACCAGCCCCACCGAGAGTGCCTTCATCACCAGCAACTCCA
TGGAGTTGCTGGTGATGAAGGCACTCTCGGTGGGGCTGGTGAGAGGCAGCATAGACGAGGTGGACAAGCGGGTTCACATGACGTGGGTGCAGCCGCGAGTGCTGGATTTGCAGCAGGTAAGGGGGCTGGGCCCCTGCCCTTTGCGGGTTTACCAATGCTGTCCTAGTTTTATGTGGACAGAAGCCATTTAGGAAGCGGACGCTGTCGCACACAAGGTGGGATTACTGCTGTGGAGGCCTAGCAAAGCAGCCCTCTGAGGTCTGCGCCCTGCACTTCCAGGCTGGCTGCGCTGCTGATGCCTCCTAATCCTGAAATTGCTGCTCCAGCAGACGTGTTCCTGGTTTTGTGACTTGATCCCATGCCCTATCATG
WARNING @ Wed, 13 Jul 2016 15:43:15:
The amplicon sequence Asap1 provided:
GTTTGGGTGGCATGAGTTTATCAAATAAGAGGGTAAGGCGTGTCAAAAATGACCACCACACAGAGCCCTCCCAGCTCCAGCCGGTTTGCTCCTGGCTCTGCAGATGAACGAGTCAAGATTATTCCAAGCTCAGCAGTGGTAAACAGCCAGGGATTTCTTTCTGAATTCACCGGAGAGCCCAAACTGCCGCCACCAATTGCTGTTTCAGTTTCCTCCGAAGATTATGTTATGTATCTGCCCCTCCCCTGCCTCCTCCAGCCAAGAGGGGACTATATGAACAATGAGATATTGTGCTCTGGTAAGCA
is different from the reference sequence(both strand):
GTTTGGGTGGCATGAGTTTATCAAATAAGAGGGTAAGGCGTGTCAAAAATGACCACCACACAGAGCCCTCCCAGCTCCAGCCGGTTTGCTCCTGGCTCTGCAGATGAACGAGTCAAGATTATTCCAAGCTCAGCAGTGGTAAACAGCCAGGGATTTCTTTCTGAATTCACCGGAGAGCCCAAACTGCCGCCACCAATTGCTGTTTCAGTTTCCTCCGAAGATTATGTTATGTATCTGCCCCTCCCCTGCCTCCTCCAGCCAAGAGGGGACTATATGAACAATGAGATATTGTGCTCTGGTAAGCA
TGCTTACCAGAGCACAATATCTCATTGTTCATATAGTCCCCTCTTGGCTGGAGGAGGCAGGGGAGGGGCAGATACATAACATAATCTTCGGAGGAAACTGAAACAGCAATTGGTGGCGGCAGTTTGGGCTCTCCGGTGAATTCAGAAAGAAATCCCTGGCTGTTTACCACTGCTGAGCTTGGAATAATCTTGACTCGTTCATCTGCAGAGCCAGGAGCAAACCGGCTGGAGCTGGGAGGGCTCTGTGTGGTGGTCATTTTTGACACGCCTTACCCTCTTATTTGATAAACTCATGCCACCCAAAC
WARNING @ Wed, 13 Jul 2016 15:43:15:
The amplicon sequence chr10_1 provided:
GATGGAGGATGGGAAGAACAATAATTAGAGGGCCACGGTCACGGGATGCGCACAGGCAGAGCTCCTCAGCGCCTCTCAGATGTGAGGCCGAAGCCTAATTATGAAAAGCTGCTGGGTCGGAAGACAGAGGCTGCTGTCTTGGGACATCAGATGCATAAGTGAGATTACTTTTCAGGATAGTGATAAACAACAGGCGTAAACACCCGAGGGAGGGATGGAAAACAGACTCGTGGTCTCTGATGAGGAGATCAGTACCCAGGTTTCGCTCTCCTTAGGGTGACTTCATCAGTGG
is different from the reference sequence(both strand):
GATGGAGGATGGGAAGAACAATAATTAGAGGGCCACGGTCACGGGATGCGCACAGGCAGAGCTCCTCAGCGCCTCTCAGATGTGAGGCCGAAGCCTAATTATGAAAAGCTGCTGGGTCGGAAGACAGAGGCTGCTGTCTTGGGACATCAGATGCATAAGTGAGATTACTTTTCAGGATAGTGATAAACAACAGGCGTAAACACCCGAGGGAGGGATGGAAAACAGACTCGTGGTCTCTGATGAGGAGATCAGTACCCAGGTTTCGCTCTCCTTAGGGTGACTTCATCAGTGG
CCACTGATGAAGTCACCCTAAGGAGAGCGAAACCTGGGTACTGATCTCCTCATCAGAGACCACGAGTCTGTTTTCCATCCCTCCCTCGGGTGTTTACGCCTGTTGTTTATCACTATCCTGAAAAGTAATCTCACTTATGCATCTGATGTCCCAAGACAGCAGCCTCTGTCTTCCGACCCAGCAGCTTTTCATAATTAGGCTTCGGCCTCACATCTGAGAGGCGCTGAGGAGCTCTGCCTGTGCGCATCCCGTGACCGTGGCCCTCTAATTATTGTTCTTCCCATCCTCCATC
WARNING @ Wed, 13 Jul 2016 15:43:15:
The amplicon sequence chr14 provided:
CCCTGAGATCAACACTGTCTTCCCACACAAAATGCTCACGCTGCCATTTAATGTCAGGTAAACAGACTTGTACTTAGTAAAAGCTTCGTGGAATTGTTCATCTCTACAGAGGGCAGCCACCAGCAACCTACTGGATCAGGAACCCACGCACCATCAAAGAGGAAAAGCATCCGTGGTAAACACCCGAGGTGATGAACCTGCTCCCAAAGAGCAAAGACAAAAACTAACTCAACCTGCCGCACAGACACACATGCTCGTTCTTTTTTTTTCTTTTTTGGTTTTTCCAGACAGGGTTTCTCTGTATAGCCCTGGCTGTCCTGGAACTCACTTTGTAGACCAGG
is different from the reference sequence(both strand):
CCCTGAGATCAACACTGTCTTCCCACACAAAATGCTCACGCTGCCATTTAATGTCAGGTAAACAGACTTGTACTTAGTAAAAGCTTCGTGGAATTGTTCATCTCTACAGAGGGCAGCCACCAGCAACCTACTGGATCAGGAACCCACGCACCATCAAAGAGGAAAAGCATCCGTGGTAAACACCCGAGGTGATGAACCTGCTCCCAAAGAGCAAAGACAAAAACTAACTCAACCTGCCGCACAGACACACATGCTCGTTCTTTTTTTTTCTTTTTTGGTTTTTCCAGACAGGGTTTCTCTGTATAGCCCTGGCTGTCCTGGAACTCACTTTGTAGACCAGG
CCTGGTCTACAAAGTGAGTTCCAGGACAGCCAGGGCTATACAGAGAAACCCTGTCTGGAAAAACCAAAAAAGAAAAAAAAAGAACGAGCATGTGTGTCTGTGCGGCAGGTTGAGTTAGTTTTTGTCTTTGCTCTTTGGGAGCAGGTTCATCACCTCGGGTGTTTACCACGGATGCTTTTCCTCTTTGATGGTGCGTGGGTTCCTGATCCAGTAGGTTGCTGGTGGCTGCCCTCTGTAGAGATGAACAATTCCACGAAGCTTTTACTAAGTACAAGTCTGTTTACCTGACATTAAATGGCAGCGTGAGCATTTTGTGTGGGAAGACAGTGTTGATCTCAGGG
WARNING @ Wed, 13 Jul 2016 15:43:15:
The amplicon sequence chr13 provided:
CTTCTGCAGAGTCAGCTTCTTTGTCATTATATAAGAGTACAGGCACTCCCCCTCAATTTATAATGGAGTCACACCCGAGATAATCCCACAGAAGTGGAAAACACCCTAAGTTGAAGATGTGTTTTGCCTTTCTCAAACATGCTTGGAGCCTTAGCATGCAGTTTAACAAACCCTCTTAACACAAAGCCAGTTTATAATGAGAGCTCAAATATCTTCTGTAAAGTACAGTGCTGAAGTGAGC
is different from the reference sequence(both strand):
CTTCTGCAGAGTCAGCTTCTTTGTCATTATATAAGAGTACAGGCACTCCCCCTCAATTTATAATGGAGTCACACCCGAGATAATCCCACAGAAGTGGAAAACACCCTAAGTTGAAGATGTGTTTTGCCTTTCTCAAACATGCTTGGAGCCTTAGCATGCAGTTTAACAAACCCTCTTAACACAAAGCCAGTTTATAATGAGAGCTCAAATATCTTCTGTAAAGTACAGTGCTGAAGTGAGC
GCTCACTTCAGCACTGTACTTTACAGAAGATATTTGAGCTCTCATTATAAACTGGCTTTGTGTTAAGAGGGTTTGTTAAACTGCATGCTAAGGCTCCAAGCATGTTTGAGAAAGGCAAAACACATCTTCAACTTAGGGTGTTTTCCACTTCTGTGGGATTATCTCGGGTGTGACTCCATTATAAATTGAGGGGGAGTGCCTGTACTCTTATATAATGACAAAGAAGCTGACTCTGCAGAAG
WARNING @ Wed, 13 Jul 2016 15:43:15:
The amplicon sequence chr10 provided:
CATCTAGCTGGTTCCTCCTTTCATTACTTCAATTCATCCACTTTGTGGTGCCACAAAGGGATTTAAAATGTCACAAAGACCGAGGCCACCAATTCCTTACCCTGTGGAGAGATAGACACTGTAGTCACTCAGGACACATTGGTCTCTTAAAGCAGGTCCTGCACAGTCAGGATGCCACAGCAATGCTAAACACCTGCAGCTGGAGTGTTTCTTGCTCGTTACAGTTCTTGACTGCACTGGATAATGTAAAGGTTGGATAATGAGTTGATCTCCGAACTGTTCTGTGGACCAATGAAACTGTAGCAAGCAG
is different from the reference sequence(both strand):
CATCTAGCTGGTTCCTCCTTTCATTACTTCAATTCATCCACTTTGTGGTGCCACAAAGGGATTTAAAATGTCACAAAGACCGAGGCCACCAATTCCTTACCCTGTGGAGAGATAGACACTGTAGTCACTCAGGACACATTGGTCTCTTAAAGCAGGTCCTGCACAGTCAGGATGCCACAGCAATGCTAAACACCTGCAGCTGGAGTGTTTCTTGCTCGTTACAGTTCTTGACTGCACTGGATAATGTAAAGGTTGGATAATGAGTTGATCTCCGAACTGTTCTGTGGACCAATGAAACTGTAGCAAGCAG
CTGCTTGCTACAGTTTCATTGGTCCACAGAACAGTTCGGAGATCAACTCATTATCCAACCTTTACATTATCCAGTGCAGTCAAGAACTGTAACGAGCAAGAAACACTCCAGCTGCAGGTGTTTAGCATTGCTGTGGCATCCTGACTGTGCAGGACCTGCTTTAAGAGACCAATGTGTCCTGAGTGACTACAGTGTCTATCTCTCCACAGGGTAAGGAATTGGTGGCCTCGGTCTTTGTGACATTTTAAATCCCTTTGTGGCACCACAAAGTGGATGAATTGAAGTAATGAAAGGAGGAACCAGCTAGATG
INFO @ Wed, 13 Jul 2016 15:43:15:
The uncompressed reference fasta file for /gpfs/fs2/scratch/aaiezza/data/ref_genome/mouse/musculus is already present! Skipping generation.
INFO @ Wed, 13 Jul 2016 15:43:15:
Aligning reads to the provided genome index...
INFO @ Wed, 13 Jul 2016 15:43:18:
Demultiplexing reads by location...
gzip: /gpfs/fs2/scratch/aaiezza/amplicon_exp/cspresso/A5_S184/CRISPRessoPooled_on_A5_S184/MAPPED_REGIONS//*.fastq: No such file or directory
INFO @ Wed, 13 Jul 2016 15:43:18:
Processing amplicon:Fmn1
WARNING @ Wed, 13 Jul 2016 15:43:18:
The amplicon Fmn1 doesn't have any read mapped to it!
Please check your amplicon sequence.
INFO @ Wed, 13 Jul 2016 15:43:18:
Processing amplicon:Dntt
WARNING @ Wed, 13 Jul 2016 15:43:18:
The amplicon Dntt doesn't have any read mapped to it!
Please check your amplicon sequence.
INFO @ Wed, 13 Jul 2016 15:43:18:
Processing amplicon:Ankrd10
WARNING @ Wed, 13 Jul 2016 15:43:18:
The amplicon Ankrd10 doesn't have any read mapped to it!
Please check your amplicon sequence.
INFO @ Wed, 13 Jul 2016 15:43:18:
Processing amplicon:Mt1
WARNING @ Wed, 13 Jul 2016 15:43:18:
The amplicon Mt1 doesn't have any read mapped to it!
Please check your amplicon sequence.
INFO @ Wed, 13 Jul 2016 15:43:18:
Processing amplicon:Psmd13
WARNING @ Wed, 13 Jul 2016 15:43:18:
The amplicon Psmd13 doesn't have any read mapped to it!
Please check your amplicon sequence.
INFO @ Wed, 13 Jul 2016 15:43:18:
Processing amplicon:Asap1
WARNING @ Wed, 13 Jul 2016 15:43:18:
The amplicon Asap1 doesn't have any read mapped to it!
Please check your amplicon sequence.
INFO @ Wed, 13 Jul 2016 15:43:18:
Processing amplicon:chr10_1
WARNING @ Wed, 13 Jul 2016 15:43:18:
The amplicon chr10_1 doesn't have any read mapped to it!
Please check your amplicon sequence.
INFO @ Wed, 13 Jul 2016 15:43:18:
Processing amplicon:chr14
WARNING @ Wed, 13 Jul 2016 15:43:18:
The amplicon chr14 doesn't have any read mapped to it!
Please check your amplicon sequence.
INFO @ Wed, 13 Jul 2016 15:43:18:
Processing amplicon:chr13
WARNING @ Wed, 13 Jul 2016 15:43:18:
The amplicon chr13 doesn't have any read mapped to it!
Please check your amplicon sequence.
INFO @ Wed, 13 Jul 2016 15:43:18:
Processing amplicon:chr10
WARNING @ Wed, 13 Jul 2016 15:43:18:
The amplicon chr10 doesn't have any read mapped to it!
Please check your amplicon sequence.
INFO @ Wed, 13 Jul 2016 15:43:18:
Reporting problematic regions...
/software/crispresso/b1/lib/python2.7/site-packages/CRISPResso/CRISPRessoPooledCORE.py:771: FutureWarning: convert_objects is deprecated. Use the data-type specific converters pd.to_datetime, pd.to_timedelta and pd.to_numeric.
df_regions=df_regions.convert_objects(convert_numeric=True)
CRITICAL @ Wed, 13 Jul 2016 15:43:18:
ERROR: Cannot set a frame with no defined index and a value that cannot be converted to a Series
~~~CRISPRessoPooled~~~
-Analysis of CRISPR/Cas9 outcomes from POOLED deep sequencing data-
) )
( _______________________ (
__)__ | __ __ __ __ __ | __)__
C\| \ ||__)/ \/ \| |_ | \ | C\| \
\ / || \__/\__/|__|__|__/ | \ /
\___/ |_______________________| \___/
[Luca Pinello 2015, send bugs, suggestions or *green coffee* to lucapinello AT gmail DOT com]
Version 0.9.7
Mapping amplicons to the reference genome...
srun: error: bhc0045: task 0: Exited with exit code 255
srun: Terminating job step 8023737.1
Sorry for the slow response, but I am getting married in a few days :)
Thanks for the detailed log. It seems that there is a problem with the mapping of the amplicon to the reference genome. Under the hood I use bowtie2 to perform this operation.
Could you please check/try two things:
1) That bowtie2 is installed properly and that the index provided is correct.
2) Try to run CRISPResso in "genome only mode" and in "amplicon only mode" to see whether the problem is the calls to bowtie2 in general or just the mapping of the amplicons. To do that, you can use these two commands:
GENOME ONLY MODE:
CRISPRessoPooled \
--fastq_r1 A5_S184_L001_R1_001.fastq.gz \
--fastq_r2 A5_S184_L001_R2_001.fastq.gz \
--bowtie2_index /data/ref_genome/mouse/musculus \
--gene_annotations /data/ref_genome_annot/ucsc/mouse/vMM10.annotation.gz \
--n_processes 4 \
--name A5_S184 \
--trim_sequences \
--max_paired_end_reads_overlap 250 \
--exclude_bp_from_left 10 \
--exclude_bp_from_right 10 \
--output_folder cspresso/A5_S184 \
--save_also_png
AMPLICON ONLY MODE:
CRISPRessoPooled \
--fastq_r1 A5_S184_L001_R1_001.fastq.gz \
--fastq_r2 A5_S184_L001_R2_001.fastq.gz \
--amplicons_file amplicons_description.txt \
--gene_annotations /data/ref_genome_annot/ucsc/mouse/vMM10.annotation.gz \
--n_processes 4 \
--name A5_S184 \
--trim_sequences \
--max_paired_end_reads_overlap 250 \
--exclude_bp_from_left 10 \
--exclude_bp_from_right 10 \
--output_folder cspresso/A5_S184 \
--save_also_png
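For the first check, a small script along these lines can confirm that bowtie2 is on the PATH and that index files exist for the prefix passed to --bowtie2_index. This is only a sketch and not part of CRISPResso: the function names are made up for illustration, and it assumes Python 3 and the usual file names written by bowtie2-build (.1.bt2 through .rev.2.bt2, or .bt2l for large indexes).

```python
import os
import shutil

# Suffixes written by bowtie2-build; each is followed by ".bt2" or ".bt2l".
BT2_SUFFIXES = [".1", ".2", ".3", ".4", ".rev.1", ".rev.2"]


def bowtie2_on_path():
    """Return True if a bowtie2 executable can be found on the PATH."""
    return shutil.which("bowtie2") is not None


def missing_index_files(index_prefix):
    """Return the index files missing for this prefix (empty list if complete).

    Both the small (.bt2) and large (.bt2l) layouts are tried; if neither is
    complete, the shorter list of missing files is returned.
    """
    best = None
    for ext in ("bt2", "bt2l"):
        paths = ["%s%s.%s" % (index_prefix, suffix, ext) for suffix in BT2_SUFFIXES]
        missing = [p for p in paths if not os.path.isfile(p)]
        if not missing:
            return []
        if best is None or len(missing) < len(best):
            best = missing
    return best
```

Calling `missing_index_files("/data/ref_genome/mouse/musculus")` should return an empty list if the prefix is right; a non-empty list usually means the prefix points at the wrong folder or includes a file extension it should not.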
Thanks and I hope we can solve this quickly!
Hey congratulations!
I have tried running it with just amplicons, and it seems to work great, although without the annotation file given. (Though I'm not sure how to get the nice graphs you get when using the web UI.)
I've also tried running against just a bowtie2 genome index, and I still get the same issue of the amplicons not mapping to the reference. My guess is that my amplicon description file might be messed up:
"Fmn1_(chr2:-113435153,_intronic),_score_=0.3_4MM_[1:3:11:19],_306bp" GTCTTTGAGTTTGGGCAGAATTTCTAAACTATATCCGTCTGCTTGCCCTCGCGTCCTGTTTCCTCCATTCATGTAGTTTCCAAAAGCCAGAATGAGAGCTAAGATATCCTTCACACTCTTCATGTGCAACAAGCCCTGCAGTGGCAAACACCAGCAGTTATGGTCTGAACTACAAACACAGTCATATTCGCTCGCTCCACAGCTAACCCTCATTAAAGTAACCAAATCCTGGTAAGTGGCTTGTGATTAGTTGTATCAACAGTTGGCAAATACAAGATACATTTCACTACAGCAGTATCATGTGGG ACAAGCCCTGCAGTGGCAAACACCAGCAGTTA NA NA
"Dntt_(chr19:+41130149,_intronic),_score=0.2_4MM_[9:10:11:20],_326bp" TAGAAACAACGCTCCTACTGTCCATTTATCCACTCAACAGATATTCACACATCACCTGCCGCCTGTTGGGCAGTGATTGACAAGGGTCCAAGCCAATCAGCTTCTTGTTCACATTGAGTTTCTGTTCTAGTACAGGGAGCCCAGAGCTCAGCCACCCGGGAGCTTTGCCCTGAGGAAGAAAGTCACCCCAAAAATTTATGTTAGAGACAGCAGTTTCAAACACCCAAGGGCTTTGGATAGCTCTAAACACCGTGTACCCGAAATAATCTGGACTAGACGGTAATTTGTTTTAATTCTCTTTGTAGCAGTTTGAGAGAGACTTGCGG ACAGCAGTTTCAAACACCCA NA NA
"Ankrd10_(chr8:+11614650,_intronic),_score=0.2_4MM_[5:7:11:19],_220bp" TGTGCACAGCGTGCTATTTCACACTAGGAATTGGCAAGAATCTCTAGGGAGTGCCAACACGTTTCCTGAGTCAGACAGTACTGGAAAACACCAGAAGGCCCACGTGGCCTTCTGACATCCAGGACGACCTGCCTGCTGGCATGAGAAGAGCGAAGAGCTTCTTTCTCCTGCCTGACCAGGAAGGGAACAATGCTGTCTCCATAAAGGAGAGGCTCTGGCT ACAGTACTGGAAAACACCAGA NA NA
"Mt1_(chr8:-96703655,_intronic),_score=0.1_4MM_[11:12:19:20],_330bp" GGACATTTCTCAGAGCCAGTTTTGTAGGAGTTCCCCGCCCCTAGCCTTAGCCGCCACCCAAGGTGTCCCAACTCACTCTTCTTGCAGGAGGTGCACTTGCAGTTCTTGCAGGCGCAGGAGCTGGTGCAAGTGCAGGAGCCGCCTGGGGAGGAGAAAGAAGACAGCATGAGGGAGGCAGCATTACAGCAGTGGCCAACACCACGAGTCCCGGCTCAGTTCACTAAGTCCTCCTCGGAGCTGCAGGGAGCCTAGCCCCACTTTTCTCCTCACAGGTTAAGTCAGGGATTATGTCTTTGAGTCCCAAGACATAAAGGTCCTTCACCTCTTTCT ACAGCAGTGGCCAACACCACGAGTCC NA NA
"Psmd13_(chr7:-148083641,_intronic),_score=0.0_4MM_[7:16:18:20],_371bp" CATGATAGGGCATGGGATCAAGTCACAAAACCAGGAACACGTCTGCTGGAGCAGCAATTTCAGGATTAGGAGGCATCAGCAGCGCAGCCAGCCTGGAAGTGCAGGGCGCAGACCTCAGAGGGCTGCTTTGCTAGGCCTCCACAGCAGTAATCCCACCTTGTGTGCGACAGCGTCCGCTTCCTAAATGGCTTCTGTCCACATAAAACTAGGACAGCATTGGTAAACCCGCAAAGGGCAGGGGCCCAGCCCCCTTACCTGCTGCAAATCCAGCACTCGCGGCTGCACCCACGTCATGTGAACCCGCTTGTCCACCTCGTCTATGCTGCCTCTCACCAGCCCCACCGAGAGTGCCTTCATCACCAGCAACTCCA ACAGCATTGGTAAACCCGCAA NA NA
"Asap1_chr15:+64146590 Score=0.6_3MMs_[1:17:20],_305bp" GTTTGGGTGGCATGAGTTTATCAAATAAGAGGGTAAGGCGTGTCAAAAATGACCACCACACAGAGCCCTCCCAGCTCCAGCCGGTTTGCTCCTGGCTCTGCAGATGAACGAGTCAAGATTATTCCAAGCTCAGCAGTGGTAAACAGCCAGGGATTTCTTTCTGAATTCACCGGAGAGCCCAAACTGCCGCCACCAATTGCTGTTTCAGTTTCCTCCGAAGATTATGTTATGTATCTGCCCCTCCCCTGCCTCCTCCAGCCAAGAGGGGACTATATGAACAATGAGATATTGTGCTCTGGTAAGCA TCAGCAGTGGTAAACAGCCA NA NA
"chr10:+121100989_Score=1.5_3MMs_[4:8:9],_292bp" GATGGAGGATGGGAAGAACAATAATTAGAGGGCCACGGTCACGGGATGCGCACAGGCAGAGCTCCTCAGCGCCTCTCAGATGTGAGGCCGAAGCCTAATTATGAAAAGCTGCTGGGTCGGAAGACAGAGGCTGCTGTCTTGGGACATCAGATGCATAAGTGAGATTACTTTTCAGGATAGTGATAAACAACAGGCGTAAACACCCGAGGGAGGGATGGAAAACAGACTCGTGGTCTCTGATGAGGAGATCAGTACCCAGGTTTCGCTCTCCTTAGGGTGACTTCATCAGTGG ACAACAGGCGTAAACACCCG NA NA
"chr14:+79726708__Score=1.5_3MMs_[1:4:6],_341bp" CCCTGAGATCAACACTGTCTTCCCACACAAAATGCTCACGCTGCCATTTAATGTCAGGTAAACAGACTTGTACTTAGTAAAAGCTTCGTGGAATTGTTCATCTCTACAGAGGGCAGCCACCAGCAACCTACTGGATCAGGAACCCACGCACCATCAAAGAGGAAAAGCATCCGTGGTAAACACCCGAGGTGATGAACCTGCTCCCAAAGAGCAAAGACAAAAACTAACTCAACCTGCCGCACAGACACACATGCTCGTTCTTTTTTTTTCTTTTTTGGTTTTTCCAGACAGGGTTTCTCTGTATAGCCCTGGCTGTCCTGGAACTCACTTTGTAGACCAGG GCATCCGTGGTAAACACCCG NA NA
"chr13:-81283131__Score=0.8_3MMs_[5:11:20],_241bp" CTTCTGCAGAGTCAGCTTCTTTGTCATTATATAAGAGTACAGGCACTCCCCCTCAATTTATAATGGAGTCACACCCGAGATAATCCCACAGAAGTGGAAAACACCCTAAGTTGAAGATGTGTTTTGCCTTTCTCAAACATGCTTGGAGCCTTAGCATGCAGTTTAACAAACCCTCTTAACACAAAGCCAGTTTATAATGAGAGCTCAAATATCTTCTGTAAAGTACAGTGCTGAAGTGAGC NA NA NA
"chr10:+83240905__Score=0.6_3MMs_[7:10:19],_310bp" CATCTAGCTGGTTCCTCCTTTCATTACTTCAATTCATCCACTTTGTGGTGCCACAAAGGGATTTAAAATGTCACAAAGACCGAGGCCACCAATTCCTTACCCTGTGGAGAGATAGACACTGTAGTCACTCAGGACACATTGGTCTCTTAAAGCAGGTCCTGCACAGTCAGGATGCCACAGCAATGCTAAACACCTGCAGCTGGAGTGTTTCTTGCTCGTTACAGTTCTTGACTGCACTGGATAATGTAAAGGTTGGATAATGAGTTGATCTCCGAACTGTTCTGTGGACCAATGAAACTGTAGCAAGCAG NA NA NA
I wasn't incredibly certain what to put for the sgRNA for each amplicon and some are just NA as directed in the manual.
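A quick sanity check of the description file can be sketched as below. It assumes the layout seen in the rows above (five tab-separated fields: name, amplicon sequence, sgRNA, then two fields that are NA here) and is not taken from CRISPResso itself; the function names are hypothetical. It flags rows with the wrong number of fields, amplicons containing characters other than A, C, G, T, and sgRNAs that do not occur in the amplicon on either strand.

```python
import csv

# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def check_amplicon_rows(handle):
    """Yield (line_number, name, problem) for each suspicious row.

    `handle` is any iterable of lines, e.g. an open file.
    """
    for line_no, row in enumerate(csv.reader(handle, delimiter="\t"), start=1):
        if not row:
            continue  # skip blank lines
        if len(row) != 5:
            yield line_no, row[0], "expected 5 tab-separated fields, found %d" % len(row)
            continue
        name, amplicon, sgrna = row[0], row[1].upper(), row[2].upper()
        if set(amplicon) - set("ACGT"):
            yield line_no, name, "amplicon contains characters other than A, C, G, T"
        if (sgrna != "NA" and sgrna not in amplicon
                and reverse_complement(sgrna) not in amplicon):
            yield line_no, name, "sgRNA not found in amplicon on either strand"
```

Used as `for problem in check_amplicon_rows(open("amplicons_description.txt")): print(problem)`, it prints nothing when every row passes. A row that was pasted with spaces instead of tabs shows up as a single field.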
Obviously with the excitement of your wedding approaching, please take your time in responding to enjoy it!
So I actually figured out my issue. I'm working on a NAS share that's cifs mounted. So running it locally works fine. Otherwise my major error is that the file handle opener is unhappy about the mounted share and gives the following message at runtime:
/bin/sh: /cvri/miano/amplicon_exp/cspresso/A5_S184/CRISPRessoPooled_on_A5_S184/Miano1-A5_S184_L001.assembled.fastq: No such file or directory
For some reason it doesn't see the file, the "demultiplexing" portion of the code then continues anyway, and every amplicon returns the same message:
gzip: stdin: unexpected end of file
Very informative troubleshooting. Sadly just one more reason I dislike python so.
Strangely enough, this error doesn't happen with the --trim_sequences flag set. It DOES occur afterward, however, on the demultiplexed gzipped files.
This leads me to believe the error may exist in both the CRISPRessoCORE.py and CRISPRessoPooledCORE.py files.
This is only inconvenient for now, but at least it works! And it does work well on the local file system.
Hi Alex, thanks for the detailed analysis. This was hard to catch/debug.
I have used CRISPResso on a network drive before and never had this problem, probably because our network drive is fast enough (though we use NFS, not CIFS).
For the first error, I don't know why it cannot open the file properly, since I am using the standard call to open a file.
For the second error, the relevant code for the demultiplexing is here:
#align in unbiased way the reads to the genome
if RUNNING_MODE=='ONLY_GENOME' or RUNNING_MODE=='AMPLICONS_AND_GENOME':
    info('Aligning reads to the provided genome index...')
    bam_filename_genome = _jp('%s_GENOME_ALIGNED.bam' % database_id)
    aligner_command= 'bowtie2 -x %s -p %s -k 1 --end-to-end -N 0 --np 0 -U %s 2>>%s| samtools view -bS - > %s' %(args.bowtie2_index,args.n_processes,processed_output_filename,log_filename,bam_filename_genome)
    sb.call(aligner_command,shell=True)
    N_READS_ALIGNED=get_n_aligned_bam(bam_filename_genome)
    #REDISCOVER LOCATIONS and DEMULTIPLEX READS
    MAPPED_REGIONS=_jp('MAPPED_REGIONS/')
    if not os.path.exists(MAPPED_REGIONS):
        os.mkdir(MAPPED_REGIONS)
    s1=r'''samtools view -F 0x0004 %s 2>>%s |''' % (bam_filename_genome,log_filename)+\
    r'''awk '{OFS="\t"; bpstart=$4; bpend=bpstart; split ($6,a,"[MIDNSHP]"); n=0;\
    for (i=1; i<=length(a); i++){\
        n+=1+length(a[i]);\
        if (substr($6,n,1)=="S"){\
            if (bpend==$4)\
                bpstart-=a[i];\
            else
                bpend+=a[i];
            }\
        else if( (substr($6,n,1)!="I") && (substr($6,n,1)!="H") )\
            bpend+=a[i];\
        }\
        if ( ($2 % 32)>=16)\
            print $3,bpstart,bpend,"-",$1,$10,$11;\
        else\
            print $3,bpstart,bpend,"+",$1,$10,$11;}' | '''
    s2=r''' sort -k1,1 -k2,2n | awk \
    'BEGIN{chr_id="NA";bpstart=-1;bpend=-1; fastq_filename="NA"}\
    { if ( (chr_id!=$1) || (bpstart!=$2) || (bpend!=$3) )\
        {\
        if (fastq_filename!="NA") {close(fastq_filename); system("gzip "fastq_filename)}\
        chr_id=$1; bpstart=$2; bpend=$3;\
        fastq_filename=sprintf("__OUTPUTPATH__REGION_%s_%s_%s.fastq",$1,$2,$3);\
        }\
    print "@"$5"\n"$6"\n+\n"$7 >> fastq_filename;\
    }' '''
    cmd=s1+s2.replace('__OUTPUTPATH__',MAPPED_REGIONS)
    info('Demultiplexing reads by location...')
    sb.call(cmd,shell=True)
    #gzip the missing ones
    sb.call('gzip %s/*.fastq' % MAPPED_REGIONS,shell=True)
As you can see, the final call compresses the reads demultiplexed by location. Since the network drive is probably too slow in your case, the files may not be synced by the time I try to compress them in the last line or use them later. One monkey patch to cover this case would be to add a delay before and after the last line to allow the cifs share to sync.
For example, this will wait for 60 seconds:
import time  # only needed if the module does not already import it
time.sleep(60)
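As an alternative to a fixed delay, one could poll until the expected file is actually visible and give up after a timeout. The helper below is a hypothetical sketch, not code from CRISPResso, and it assumes the problem really is a visibility delay on the share, which has not been confirmed in this thread. It uses only `os` and `time`, so it should also run under Python 2.7.

```python
import os
import time


def wait_for_file(path, timeout=60.0, poll_interval=1.0):
    """Wait until `path` exists and is non-empty.

    Returns True as soon as the file is visible with a size above zero,
    or False if `timeout` seconds pass first.
    """
    deadline = time.time() + timeout
    while True:
        try:
            if os.path.isfile(path) and os.path.getsize(path) > 0:
                return True
        except OSError:
            # The file vanished between the two checks; keep polling.
            pass
        if time.time() >= deadline:
            return False
        time.sleep(poll_interval)
```

A call such as `wait_for_file(processed_output_filename)` placed before the aligner command would then replace the unconditional sleep, and the `False` case could be reported as a clear error instead of letting later steps fail on an empty input.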
Hope this is helpful.
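For anyone reading the awk stage in the demultiplexing command above, this is roughly what it computes for each aligned read, rewritten in Python for illustration only. The function names are hypothetical and nothing here is part of CRISPResso. It handles the same CIGAR operations as the awk pattern (M, I, D, N, S, H, P). A leading soft clip moves the region start to the left, later soft clips extend the end, and insertions and hard clips do not consume reference positions.

```python
import re


def region_from_alignment(pos, cigar):
    """Return (bpstart, bpend) for a read given its SAM POS and CIGAR string."""
    start = end = pos
    for length, op in re.findall(r"(\d+)([MIDNSHP])", cigar):
        n = int(length)
        if op == "S":
            if end == pos:
                start -= n  # soft clip before any aligned bases
            else:
                end += n    # soft clip after aligned bases
        elif op not in ("I", "H"):
            end += n        # M, D, N and P advance the end coordinate
    return start, end


def strand_from_flag(flag):
    """Return "-" if the reverse-strand bit (0x10) is set in the SAM FLAG."""
    return "-" if flag & 0x10 else "+"
```

The strand test is equivalent to the `($2 % 32)>=16` check in the awk code. Each distinct (chromosome, bpstart, bpend) triple then becomes one REGION_*.fastq file.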
I'm running CRISPRessoPooled in mixed-mode with the following command:
This leads to the following output:
At this point the program stops executing. I found that if you alter CRISPRessoPooledCORE.py at lines 771 and 801 to
df_regions=df_regions.apply(pd.to_numeric, errors='ignore')
this problem goes away, yielding these new results:
There is still an error, but this time it continues to run, even though all that was fixed was a deprecation. I'm not really sure whether that is a good thing or not...