Open · nicolashazzi23 opened 9 months ago
You will likely need to check the log files output by samtools/bamtools. The image attached is somewhat hard to see, but it does not look like some files are being created properly. Once you have fasta files, they can both go in the same folder. Extracting SNPs is up to you - e.g. depending on your needs, you can choose where to harvest the SNP calls from... or you can choose to additionally alter/update the files produced to output SNPs.
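Checking the workflow logs can be as simple as tailing whatever log directory the run wrote. A minimal sketch — the log path and filename here are assumptions (a stand-in file is created so the example runs on its own); substitute your actual output directory:

```shell
# Create a stand-in log, then inspect its tail as one would for real workflow logs.
mkdir -p phasing_3/logs
printf 'line %s\n' 1 2 3 > phasing_3/logs/pilon.log

# Show the last 50 lines of each log file (hypothetical location).
tail -n 50 phasing_3/logs/*.log
```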
Hi Brant, thank you very much for your help. Below is the tail (last 50 lines) of the run, showing the error. It looks like a memory error, but I ran the job with the maximum capacity of our Slurm cluster ("highMem – nodes in this category have large memory – 3 TB – and are for jobs that are more memory intensive"), and I still got this error.
thanks!
Finished processing comp53472_c0_seq1:1-252
Processing comp53486_c0_seq1:1-333
bam bams/HW_0458.0.bam: Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.base/java.util.Arrays.copyOf(Arrays.java:3745)
at java.base/java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:172)
at java.base/java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:538)
at java.base/java.lang.StringBuilder.append(StringBuilder.java:174)
at htsjdk.samtools.SAMTextHeaderCodec.advanceLine(SAMTextHeaderCodec.java:139)
at htsjdk.samtools.SAMTextHeaderCodec.decode(SAMTextHeaderCodec.java:94)
at htsjdk.samtools.BAMFileReader.readHeader(BAMFileReader.java:667)
at htsjdk.samtools.BAMFileReader.
Shutting down, this might take some time. Exiting because a job execution failed. Look above for error message
Looks like you need to feed pilon some more RAM (running it on a large node alone is usually not quite enough). This should only require modifying the workflow script in the pilon section here so that it reads like:

`pilon -jar -Xmx256G --threads {threads}...`

where you'll change the `256G` to something that works for your HPC. This sets the maximum RAM pilon can use (by default it is 1 GB).
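Concretely, the change lands inside the phasing workflow file that phyluce ships. A before/after sketch, assuming the rule invokes pilon via a shell directive (the exact file path, rule name, and surrounding flags in your install may differ — this is illustrative, not the verbatim workflow text):

```shell
# Hypothetical excerpt of the pilon call in the phasing workflow.
# Before (JVM default heap, ~1 GB):
#   pilon --threads {threads} ...
# After (raise the JVM heap; swap 256G for what your node actually provides):
pilon -jar -Xmx256G --threads {threads} ...
```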
Hi Brant, thank you very much! It worked thanks to your suggestion!
Excellent 👍
Hi Brant, sorry for bothering you again, but I would like to ask once more about the 0.fasta and 1.fasta files generated by the phasing process. I want to estimate species trees and also get SNPs for a Structure analysis. When I put the 0.fasta and 1.fasta files in the same folder as you suggested and ran phyluce_assembly_match_contigs_to_probes, I got the following error: "sqlite3.OperationalError: duplicate column name: HW_0302". Should I merge the 0.fasta and 1.fasta files using cat? Or what should I do with the 0.fasta and 1.fasta files after phasing? Thanks in advance!
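One plausible way around a duplicate-name collision like this is to give each haplotype its own taxon name when pooling the files, since tools often derive taxon names from filenames. A minimal sketch — the directory layout, filenames, and renaming scheme below are assumptions for illustration, not the phyluce-prescribed procedure (demo inputs are created so the example runs on its own):

```shell
# Demo inputs standing in for real phasing output (paths are hypothetical).
mkdir -p phasing_3/fastas
printf ">comp53472_c0_seq1\nACGT\n" > phasing_3/fastas/HW_0302.0.fasta
printf ">comp53472_c0_seq1\nACGA\n" > phasing_3/fastas/HW_0302.1.fasta

# Pool both haplotypes into one contigs folder, renaming each file so
# every allele reads as a distinct "taxon" (HW_0302_0, HW_0302_1) and
# filename-derived names no longer collide.
mkdir -p contigs-phased
for f in phasing_3/fastas/*.fasta; do
    base=$(basename "$f" .fasta)                 # e.g. HW_0302.0
    cp "$f" "contigs-phased/${base//./_}.fasta"  # -> HW_0302_0.fasta
done
```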
Hi, I am trying to run the phasing workflow with the bam files generated previously by the mapping workflow. However, I am not getting the fasta files in the results, just the bam files (see attached image). I am attaching the conf file in case it helps clarify whether I am doing something wrong. My final question is about how to construct the final SNP matrix, because the tutorial says this after phasing: "You can essentially group all the .0.fasta and .1.fasta files for all taxa together as new “assemblies” of data and start the phyluce analysis process over from phyluce_assembly_match_contigs_to_probes." I find this somewhat ambiguous: should I create one contig folder with the 0.fasta files and a second folder with the 1.fasta files, or a single folder with both sets of files together? And how can I extract the SNPs at the end?
This is the command that I ran:

phyluce_workflow --config bams_2.conf \
    --output phasing_3 \
    --workflow phasing \
    --cores 1
phasing.txt