Closed: klychuk closed this issue 4 years ago
I suspect there are duplicated sgRNA names in your library. Could you share the FASTA file, if you don’t mind? Thank you,
Hyun-Hwan Jeong
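For anyone who wants to check a library for duplicated names before sharing it, here is a minimal sketch. The function name `duplicated_names` is hypothetical, and the `split_on_whitespace` option rests on an assumption: that a parser may keep only the part of a header before the first whitespace. That behavior is not confirmed in this thread.

```python
from collections import Counter

def duplicated_names(fasta_text, split_on_whitespace=True):
    """Return {name: count} for sgRNA names that occur more than once."""
    names = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            header = line[1:].strip()
            # Assumption: some parsers keep only the text before the
            # first whitespace, so compare names the same way.
            if split_on_whitespace and header:
                header = header.split()[0]
            names.append(header)
    return {name: n for name, n in Counter(names).items() if n > 1}

example = ">guideA 1\nACGT\n>guideA 2\nTTGA\n>guideB\nCCGT\n"
print(duplicated_names(example))                             # {'guideA': 2}
print(duplicated_names(example, split_on_whitespace=False))  # {}
```

Comparing both modes shows whether names are unique as written but collide once the text after a space is dropped.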
Yes, I attached the file. GitHub does not support .fasta attachments, so I added .txt to the end, but it is still formatted as FASTA.
Thanks for sharing it. I found that some guide names look like the ones below, and these names cause the problem.
>2020-09-10 00:00:00_1
CGCGACCATGGCCTCCTCCG
>2020-09-10 00:00:00_10
GCTTAATCTTCAGTTCTTCT
>2020-09-10 00:00:00_2
CGCCACCTCGGAGGAGGCCA
>2020-09-10 00:00:00_3
GTGCCGCGCCACCTCGGAGG
>2020-09-10 00:00:00_4
CACCAGGTGCCGCGCCACCT
>2020-09-10 00:00:00_5
ATTCGTTCGTTGACTATGTC
>2020-09-10 00:00:00_6
TGCTTTAATATTCTCTGTGT
>2020-09-10 00:00:00_7
AAATCCCACTGTATTCACAA
>2020-09-10 00:00:00_8
ACCAAATAAATAAAGAAGAG
>2020-09-10 00:00:00_9
CTTCAGTTCTTCTTGGAGAT
>2020-09-11 00:00:00_1
TGCCGCAGCTGCGATGGCCG
>2020-09-11 00:00:00_10
TGTTTCAACATCCTTTGTGT
>2020-09-11 00:00:00_2
AGCTGCGATGGCCGTGGCCG
>2020-09-11 00:00:00_3
CTTCGAAACTTGTCTTTGTC
>2020-09-11 00:00:00_4
CTTGTCTTTGTCTGGCCATG
>2020-09-11 00:00:00_5
GGAGGCTGTCAAATCCCACA
>2020-09-11 00:00:00_6
TGACAGCCTCCCTGACCAGC
>2020-09-11 00:00:00_7
TGTTGACCAGCTGGTCAGGG
>2020-09-11 00:00:00_8
AAGTAGACTTGTTGACCAGC
>2020-09-11 00:00:00_9
GTCAACAAGTCTACTTCTCA
>2020-09-12 00:00:00_1
TGCGAGGACAGGCAGGGAGA
>2020-09-12 00:00:00_10
TGAGTTCAACATCATGGTGG
>2020-09-12 00:00:00_2
GAGGGCTGCGAGGACAGGCA
>2020-09-12 00:00:00_3
GGCTGGAGGGCTGCGAGGAC
>2020-09-12 00:00:00_4
GGACCAAGCATCTCGCAGGG
>2020-09-12 00:00:00_5
TGCGAGATGCTTGGTCCTGT
>2020-09-12 00:00:00_6
TGTGGGCATTGAGGCTGTGC
>2020-09-12 00:00:00_7
GCTGGACCAGCTGAAGATCA
>2020-09-12 00:00:00_8
TCATAGCCTTGATCTTCAGC
>2020-09-12 00:00:00_9
AAGATCAAGGCTATGAAGAT
>2020-09-14 00:00:00_1
TAGCATGGCAGAAAGAACAA
>2020-09-14 00:00:00_10
CTACATAGATGCCCAATTTG
>2020-09-14 00:00:00_2
ATTCGTTGTTTAACTACGAT
>2020-09-14 00:00:00_3
TGAATGTTTGCCCAATCAGT
>2020-09-14 00:00:00_4
AGATCTGCTCACCAACTGAT
>2020-09-14 00:00:00_5
GTGAGCAGATCTATCCGACA
>2020-09-14 00:00:00_6
TCAGTTGAAATTGACTGTTG
>2020-09-14 00:00:00_7
TGACTGTTGTGGAGACAGTA
>2020-09-14 00:00:00_8
GTTGTGGAGACAGTAGGGTA
>2020-09-14 00:00:00_9
ATCAAATAGACAAAGAAGCC
>2020-09-01 00:00:00_1
TCCATCATCGTGGTGAGACA
>2020-09-01 00:00:00_10
CCATAGGACAAGGAGTACGT
>2020-09-01 00:00:00_2
CACGATGATGGAGCTACAGT
>2020-09-01 00:00:00_3
GATGGAGCTACAGTGGGACT
>2020-09-01 00:00:00_4
CTTGGAATCCAGATGTGTGA
>2020-09-01 00:00:00_5
CCAGATGTGTGAAGGATGGA
>2020-09-01 00:00:00_6
TGTGAAGGATGGAGGGTTGA
>2020-09-01 00:00:00_7
AGAGACGGCAGGTGCAGTGA
>2020-09-01 00:00:00_8
GCAGGTGCAGTGATGGCTGG
>2020-09-01 00:00:00_9
AGTGATGGCTGGCGGAGTCA
>2020-09-03 00:00:00_1
AAAGGAGGATTCATGTCCAA
>2020-09-03 00:00:00_10
TTCATGGGCACCGCTGGCTT
>2020-09-03 00:00:00_2
CTGCAGGGCTCCCAGAGACC
>2020-09-03 00:00:00_3
CTGCGTCCGTCCTGGTCTCT
>2020-09-03 00:00:00_4
TGACATGGCTGCGTCCGTCC
>2020-09-03 00:00:00_5
GGACGCAGCCATGTCAGAGC
>2020-09-03 00:00:00_6
CTCAGGCACCAGCTCTGACA
>2020-09-03 00:00:00_7
CAGAGCTGGTGCCTGAGCCC
>2020-09-03 00:00:00_8
TGAGCCCAGGCCTAAGCCAG
>2020-09-03 00:00:00_9
GGGCACCGCTGGCTTAGGCC
>2020-09-04 00:00:00_1
GACTTTACCCTCATGGTGGC
>2020-09-04 00:00:00_10
AGAGAGGATCATGCAAACTG
>2020-09-04 00:00:00_2
TCTCTCCTCTCAGGAGAGTC
>2020-09-04 00:00:00_3
CTCTCAGGAGAGTCTGGCCT
>2020-09-04 00:00:00_4
TGACAAGTGTGGATTTGCCC
>2020-09-04 00:00:00_5
GAAGAGGCTATTGACAAGTG
>2020-09-04 00:00:00_6
CTTCCTCACTGATCTGTACC
>2020-09-04 00:00:00_7
TCACTGATCTGTACCGGGAC
>2020-09-04 00:00:00_8
CACCAAGAAGTTTCCGGTCC
>2020-09-04 00:00:00_9
CGGAAACTTCTTGGTGCTGA
>2020-09-05 00:00:00_1
CGTACTGCTTGTCAATGTCC
>2020-09-05 00:00:00_10
GTGTGCAGGTGAGTCAGGCC
>2020-09-05 00:00:00_2
CTTCGCCACACTGCCCAACC
>2020-09-05 00:00:00_3
GTGCACCTGGTTGGGCAGTG
>2020-09-05 00:00:00_4
GACTTGCGGTGCACCTGGTT
>2020-09-05 00:00:00_5
TCACCGACTTGCGGTGCACC
>2020-09-05 00:00:00_6
CACCGCAAGTCGGTGAAGAA
>2020-09-05 00:00:00_7
AGGCTTTGACTTCACACTCA
>2020-09-05 00:00:00_8
GACTTCACACTCATGGTGGC
>2020-09-05 00:00:00_9
TACCTGTGTGCAGGTGAGTC
>2020-09-06 00:00:00_1
AGCGACCGATATAGCTCGCC
>2020-09-06 00:00:00_10
TGCTTCAACATCCTGTGCGT
>2020-09-06 00:00:00_2
TACCACCTGGCGAGCTATAT
>2020-09-06 00:00:00_3
CTCCTTCCAAATTAGGGTGA
>2020-09-06 00:00:00_4
TGACAGCTTGCCTGACCAGC
>2020-09-06 00:00:00_5
GACTTATTCACCAGCTGGTC
>2020-09-06 00:00:00_6
TGACGGACTTATTCACCAGC
>2020-09-06 00:00:00_7
GTGAATAAGTCCGTCAGCCA
>2020-09-06 00:00:00_8
GAAGCAGAAGCCCTGGCTGA
>2020-09-06 00:00:00_9
GGATGTTGAAGCAGAAGCCC
>2020-09-08 00:00:00_1
CAACACGACCTTCGAGACTG
>2020-09-08 00:00:00_10
TTGTGGATGCCGTGGGCTTT
>2020-09-08 00:00:00_2
ACTGGCTTCCTCAGTCTCGA
>2020-09-08 00:00:00_3
TGAGGAAGCCAGTCACCATG
>2020-09-08 00:00:00_4
CACGCATGCCTCATGGTGAC
>2020-09-08 00:00:00_5
ATGAGGCATGCGTGCGCCTG
>2020-09-08 00:00:00_6
TCTCCTGGAGGTCATAGGTC
>2020-09-08 00:00:00_7
TGAGCTGCACGTTGCTCTCC
>2020-09-08 00:00:00_8
GCAGCTCAAGCTGACCATTG
>2020-09-08 00:00:00_9
CTGACCATTGTGGATGCCGT
>2020-09-09 00:00:00_1
TGCTTGAGCCCGGCATCTCT
>2020-09-09 00:00:00_10
TTACCCAAGCCGCTCTGCCC
>2020-09-09 00:00:00_2
TGCAGGCGCCTGCTTGAGCC
>2020-09-09 00:00:00_3
TCAAGCAGGCGCCTGCATCA
>2020-09-09 00:00:00_4
GCCTGCATCACGGAACGAGA
>2020-09-09 00:00:00_5
CCACGTAGCCGAAGTCCACC
>2020-09-09 00:00:00_6
CCATCCTGGAGCAGATGCGC
>2020-09-09 00:00:00_7
GCGCCGGAAGGCCATGAAGC
>2020-09-09 00:00:00_8
GAACTCGAAGCCCTGCTTCA
>2020-09-09 00:00:00_9
GGGCTTCGAGTTCAACATCA
The main problem is the space between the date and the time, which leads to the duplication problem. The easiest solution is to replace the space between the date and the time with a character like -.
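The replacement suggested above can be sketched as follows. This is only an illustration: `sanitize_fasta_headers` is a hypothetical helper, not part of CB2, and it assumes the whole library fits in memory as a string.

```python
import re

def sanitize_fasta_headers(fasta_text, replacement="-"):
    """Replace whitespace inside FASTA header lines; sequences are untouched."""
    out = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            name = line[1:].strip()
            # Collapse each run of spaces or tabs into the replacement character.
            line = ">" + re.sub(r"\s+", replacement, name)
        out.append(line)
    return "\n".join(out) + "\n"

record = ">2020-09-10 00:00:00_1\nCGCGACCATGGCCTCCTCCG\n"
print(sanitize_fasta_headers(record), end="")
# >2020-09-10-00:00:00_1
# CGCGACCATGGCCTCCTCCG
```

After the rewrite, every header is a single token, so the guide index at the end of the name is kept.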
I believe these genes should be named SEPTX (where X is a number), and that is the name you expected to see. I guess you or the original data provider used Microsoft Excel during data processing, and Excel automatically converted the names to dates. Let me know if you think this is a problem and need my help.
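If the original names do need to be recovered, a rough sketch is shown here. It relies on assumptions that should be checked against the original gene list: that each date came from a name of the form SEPTX, that the day of the month equals X, and that the restored name should look like `SEPT10_1`. The pattern and the function `restore_sept_name` are hypothetical.

```python
import re

# Hypothetical pattern for the mangled names seen in this thread,
# e.g. "2020-09-10 00:00:00_1" (date, midnight timestamp, guide index).
MANGLED = re.compile(r"^\d{4}-(\d{2})-(\d{2})[ -]00:00:00_(\d+)$")

def restore_sept_name(name):
    """Map a September date name back to SEPT<day>_<guide>; leave others as-is."""
    m = MANGLED.match(name)
    if m is None or m.group(1) != "09":
        return name
    # Assumption: day of month is the gene number X in SEPTX.
    day, guide = int(m.group(2)), m.group(3)
    return f"SEPT{day}_{guide}"

print(restore_sept_name("2020-09-10 00:00:00_1"))   # SEPT10_1
print(restore_sept_name("2020-09-01 00:00:00_10"))  # SEPT1_10
```

Names that do not match the date pattern are returned unchanged, so the function can be applied to every header in the library.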
Thank you,
Hyun-Hwan Jeong
Thank you so much!
Hello,
I have previously run CB2 successfully and enjoyed the methods as well as the documentation. When I went to run it on a different experiment, I received this error. I thought it might have to do with my library construction, but I used a Python dictionary to populate the .fasta file, so each value should be unique. The row names don't look like names, and I'm not sure where the issue is coming from.
Thanks,
Karson