maximilianh / crisporWebsite

All source code of the crispor.org website
http://crispor.org

[E::bwa_idx_load] fail to locate the index files on custom built genome #10

Closed elijahlowe closed 6 years ago

elijahlowe commented 6 years ago

Sorry to bother you again. I built a custom genome using the following command and received this output:

sudo ../../tools/crisprAddGenome fasta JoinedScaffold.fasta --desc 'cionaRobustaKH|Ciona robusta|C. robusta|Ghost Joined Scaffolds' --gff KH.KHGene.2012.gff3

 ==== /tmp2/cionaRobustaKH/cionaRobustaKH.sizes exists - not indexing with BWA and not converting to twobit ==== 
moving /tmp2/cionaRobustaKH/cionaRobustaKH.gp to crispr genome dir /var/www/crispor/genomes/cionaRobustaKH
moving /tmp2/cionaRobustaKH/cionaRobustaKH.segments.bed to crispr genome dir /var/www/crispor/genomes/cionaRobustaKH
moving /tmp2/cionaRobustaKH/cionaRobustaKH.sizes to crispr genome dir /var/www/crispor/genomes/cionaRobustaKH
moving /tmp2/cionaRobustaKH/cionaRobustaKH.2bit to crispr genome dir /var/www/crispor/genomes/cionaRobustaKH
wrote /var/www/crispor/genomes/cionaRobustaKH/genomeInfo.tab

I then tried to run crispor and got a message saying it cannot locate the index.

~/bioinformatics_software/crisporWebsite/crispor.py cionaRobustaKH KH.test.fasta testOutput.tsv -o offtaget.tsv
INFO:root: * running on sequence 'KH2012:KH.C1.1.v1.A.ND1-1', guideLen=20, seqLen=995
[E::bwa_idx_load] fail to locate the index files
ERROR:root:Error: could not run command set -o pipefail; /Users/elowe3/bioinformatics_software/crisporWebsite/bin/Darwin/bwa bwasw -T 20 /Users/elowe3/bioinformatics_software/crisporWebsite/genomes/cionaRobustaKH/cionaRobustaKH.fa /var/folders/bz/lbll3mbn29v8bpf5w2hfdfyngydffc/T/crisporBestMatchqQEQg7.fa > /var/folders/bz/lbll3mbn29v8bpf5w2hfdfyngydffc/T/crisporBestMatchXoRay1.sam.

I thought it was strange that I received the message "not indexing with BWA" when I created the genome, so I ran bwa index and linked the index files to all places I thought could solve the problem:

ls /Users/elowe3/bioinformatics_software/crisporWebsite/genomes/cionaRobustaKH/
JoinedScaffold.fasta    cionaRobustaKH.fa.amb   cionaRobustaKH.fa.bwt   cionaRobustaKH.fa.sa
KH.KHGene.2012.gff3 cionaRobustaKH.fa.ann   cionaRobustaKH.fa.pac

ls /tmp2/cionaRobustaKH/
cionaRobustaKH.fa   cionaRobustaKH.fa.ann   cionaRobustaKH.fa.pac
cionaRobustaKH.fa.amb   cionaRobustaKH.fa.bwt   cionaRobustaKH.fa.sa

ls /var/www/crispor/genomes/cionaRobustaKH/
cionaRobustaKH.2bit     cionaRobustaKH.fa.pac       cionaRobustaKH.sizes
cionaRobustaKH.fa.amb       cionaRobustaKH.fa.sa        genomeInfo.tab
cionaRobustaKH.fa.ann       cionaRobustaKH.gp
cionaRobustaKH.fa.bwt       cionaRobustaKH.segments.bed

Now I'm stuck...
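
For reference, bwa locates its index by appending suffixes to the prefix given on the command line, so the `.amb`, `.ann`, `.bwt`, `.pac` and `.sa` files have to resolve next to the `cionaRobustaKH.fa` path shown in the failing command. Below is a minimal sketch that reports which files are missing from one genome directory. The function name `missing_genome_files` is hypothetical, and the file list is an assumption copied from the /var/www listing above (leaving out the gff-derived files); it is not crispor's own check.

```shell
# Hypothetical helper, not part of crispor: report files missing from a
# genome directory. The suffix list is an assumption taken from the
# directory listings in this thread.
missing_genome_files() {
    mgf_dir=$1   # e.g. /var/www/crispor/genomes/cionaRobustaKH
    mgf_db=$2    # e.g. cionaRobustaKH
    mgf_status=0
    for mgf_ext in fa.amb fa.ann fa.bwt fa.pac fa.sa 2bit sizes; do
        # -e follows symlinks, so a dangling link is reported as missing
        if [ ! -e "$mgf_dir/$mgf_db.$mgf_ext" ]; then
            printf 'missing: %s\n' "$mgf_dir/$mgf_db.$mgf_ext"
            mgf_status=1
        fi
    done
    if [ ! -e "$mgf_dir/genomeInfo.tab" ]; then
        printf 'missing: %s\n' "$mgf_dir/genomeInfo.tab"
        mgf_status=1
    fi
    # 0 when everything was found, 1 otherwise
    return "$mgf_status"
}
```

Running it once per candidate directory, for example `missing_genome_files /var/www/crispor/genomes/cionaRobustaKH cionaRobustaKH`, shows at a glance which copy of the genome is complete.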

maximilianh commented 6 years ago

You've saved the genome into /var/www/crispor/genomes/, but your crispor program is in ~/bioinformatics_software/crisporWebsite/crispor.py. How should crispor.py be able to guess that the genome is under /var/www? You need to either move "genomes" into the same directory where crispor.py is located, or specify the genomes directory with -g.

But may I ask why you are doing this at all? I already have Ciona robusta on the website. Also, if this is a different genome, you can simply send me the FASTA and I can add it to the website.
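
The `-g` option mentioned above can be combined with a quick check of where the genome actually lives. The following is only a sketch: `crispor_genome_dir` is a hypothetical helper, and the `genomes/<db>/<db>.fa.bwt` layout it tests for is an assumption read off the paths in this thread.

```shell
# Hypothetical helper: print the first candidate genomes directory that
# holds a BWA index for the given genome name. Not part of crispor.
crispor_genome_dir() {
    cgd_db=$1
    shift
    for cgd_dir in "$@"; do
        # Assumed layout, taken from the paths in this thread:
        #   <genomes dir>/<db>/<db>.fa.bwt
        if [ -e "$cgd_dir/$cgd_db/$cgd_db.fa.bwt" ]; then
            printf '%s\n' "$cgd_dir"
            return 0
        fi
    done
    # No candidate contained the index
    return 1
}
```

With the two locations from this thread, `dir=$(crispor_genome_dir cionaRobustaKH ~/bioinformatics_software/crisporWebsite/genomes /var/www/crispor/genomes)` would select a directory, which could then be handed over as `crispor.py -g "$dir" cionaRobustaKH KH.test.fasta testOutput.tsv`. Placing `-g` before the positional arguments is an assumption; the thread only states that `-g` names the genomes directory.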

elijahlowe commented 6 years ago

Thank you, that fixed the problem. I didn't place them there myself; that is where crisprAddGenome saved them.

I am doing this because I want the .tsv file for the entire genome, so I can build a database and retrieve gRNAs using gene IDs. I also have several other genomes that I will have to do the same process with, and at this time I do not have permission to share them.

maximilianh commented 6 years ago

Oh, you want to annotate the whole genome? Wow. That may take a while. Let me know how it goes. I've used crispor that way, but without a big compute cluster you may not get very far...
