samandmac / PhyloBuild

This is a pipeline that can be used to generate a phylogenetic tree, including a heatmap showing carriage of specific genes, the assigned phylogroup of each strain, and the name of each strain. It should now be capable of working with any kind of strain - currently I'm working on making small improvements.

Possible issues with PhyloGenes on macOS #8

Open samandmac opened 1 year ago

samandmac commented 1 year ago

So I've encountered two problems with users who use macOS:

To bypass these issues I recommend these solutions, respectively:

blastn -query /Users/you/Documents/PhyloTree-main/PhyloGenes/GeneListX.txt -subject /Users/you/Documents/PhyloTree-main/Tree_Genomes/$W.fasta -qcov_hsp_perc 80 -perc_identity 70 -outfmt "6 qseqid pident" | gsed 's/^\(.\{0\}\)/\1>/' | awk '!seen[$0]++' | sort -k 2n  > Z_Last_File.txt #X in GeneListX.txt should be the number of genomes. E.g. if you have 88 strains, you should use GeneList88.txt

head Z_Last_File.txt | awk '{print $1}' > Z_Ortho_Names.txt

grep -Fwf Z_Ortho_Names.txt singleLineGenes.txt | gsed 's/[[:blank:]]*\([^[:blank:]]*\)$/\n\1/' > Z_Orthologs.txt

gsed '/^>/ s/^.*gene=\([A-Za-z]\+\).*/\1/' Z_Orthologs.txt | gsed '1~2s/^/>/' > /Users/you/Documents/PhyloTree-main/Tree_Genes/geneList.txt

Remember to replace the path used in the example above with your own path.
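
To avoid counting genomes by hand when picking GeneListX.txt, the file name could be derived from the genome folder. This is only a sketch: `gene_list_name` and the `BASE` variable are hypothetical names, not part of the pipeline, and it assumes that every genome is a single `.fasta` file directly inside Tree_Genomes.

```shell
# Hypothetical helper (not part of PhyloBuild): count the .fasta files in a
# genome directory and print the matching GeneList file name,
# e.g. 88 genomes -> GeneList88.txt.
gene_list_name() {
    genome_dir=$1
    count=0
    for f in "$genome_dir"/*.fasta; do
        # Skip the unexpanded pattern when the directory has no .fasta files.
        [ -e "$f" ] || continue
        count=$((count + 1))
    done
    printf 'GeneList%s.txt\n' "$count"
}

# BASE is a placeholder; point it at your own PhyloTree-main folder.
BASE="${BASE:-$HOME/Documents/PhyloTree-main}"
echo "Query file: $BASE/PhyloGenes/$(gene_list_name "$BASE/Tree_Genomes")"
```

The printed path can then be passed to `-query` in the blastn command above.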

I'd appreciate any input from Mac users as to why the comparison in the IF statement in PhyloGenes.sh isn't working, and any workarounds for sed, given that sed differs between Linux and Mac.
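
On the sed side, one possible direction is sketched here. As far as I know, the `1~2` address (every second line starting from line 1) is a GNU extension that the BSD sed shipped with macOS does not accept, which is why the workaround calls `gsed`. The sketch picks `gsed` when it is installed and otherwise falls back to `sed`, and shows an awk equivalent of `gsed '1~2s/^/>/'` that should not depend on the sed flavour. The names `SED` and `prefix_odd_lines` are hypothetical and do not appear in PhyloGenes.sh.

```shell
# Use gsed when it is installed (e.g. Homebrew's gnu-sed), otherwise plain sed.
if command -v gsed >/dev/null 2>&1; then
    SED=gsed
else
    SED=sed
fi

# Report whether the chosen sed is GNU sed; BSD sed rejects --version.
if "$SED" --version 2>/dev/null | grep -q '(GNU sed)'; then
    echo "$SED is GNU sed"
else
    echo "$SED is not GNU sed; GNU-only syntax such as 1~2 may fail"
fi

# Hypothetical portable replacement for: gsed '1~2s/^/>/'
# Prefixes every odd-numbered line with ">" and leaves even lines unchanged.
prefix_odd_lines() {
    awk 'NR % 2 == 1 { print ">" $0; next } { print }'
}

# Example with made-up gene names and sequences.
printf 'geneA\nATGC\ngeneB\nGGCC\n' | prefix_odd_lines
```

The awk version trades the sed one-liner for a tool that behaves the same on Linux and macOS, so no extra install would be needed for that step.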

samandmac commented 1 year ago

I've updated the PhyloGenes code in patch-1 to work around the issue of it ignoring the check in the loop. I'm now testing whether this works on a Mac; if it does, the problem is solved.

samandmac commented 1 year ago

Based on conversations with Mac users, patch-1 wasn't working. For now, continue with the workaround above until I fix this.