JLSteenwyk / ClipKIT

a multiple sequence alignment-trimming algorithm for accurate phylogenomic inference
https://jlsteenwyk.com/ClipKIT/
MIT License

Cannot read in fasta files #5

Closed: LifeMine-swyka closed this issue 4 years ago

LifeMine-swyka commented 4 years ago

Hello,

I installed ClipKIT with pip install clipkit and am running clipkit v0.1.

I was initially getting an error saying that the input file could not be read, so I ran chmod 777 on all of the files to rule out a permissions problem.

(ref_creation) swyka@taxon-pipeline:~/references/2020-09-10_updates$ ls -al
total 276
drwxrwxr-x  5 swyka swyka   4096 Sep 14 13:49 .
drwxrwxr-x  3 swyka swyka   4096 Sep 10 11:27 ..
-rwxrwxrwx  1 swyka swyka 129481 Sep 14 13:49 EOG09261EM7.fa
-rwxrwxrwx  1 swyka swyka 129481 Sep 14 13:47 EOG09261EM7.fasta

However, when I run ClipKIT I still get the same error.

(ref_creation) swyka@taxon-pipeline:~/references/2020-09-10_updates$ clipkit EOG09261EM7.fasta -o EOG09261EM7_trimmed.fasta
Traceback (most recent call last):
  File "/home/swyka/miniconda3/envs/ref_creation/bin/clipkit", line 8, in <module>
    sys.exit(main())
  File "/home/swyka/miniconda3/envs/ref_creation/lib/python3.6/site-packages/clipkit/clipkit.py", line 111, in main
    execute(**process_args(args))
  File "/home/swyka/miniconda3/envs/ref_creation/lib/python3.6/site-packages/clipkit/clipkit.py", line 59, in execute
    input_file, file_format=input_file_format
  File "/home/swyka/miniconda3/envs/ref_creation/lib/python3.6/site-packages/clipkit/files.py", line 49, in get_alignment_and_format
    raise Exception("Input file could not be read")
Exception: Input file could not be read

When I force the input file format with -if fasta, however, I get a different error.

(ref_creation) swyka@taxon-pipeline:~/references/2020-09-10_updates$ clipkit EOG09261EM7.fasta -o EOG09261EM7_trimmed.fasta -if fasta
Traceback (most recent call last):
  File "/home/swyka/miniconda3/envs/ref_creation/bin/clipkit", line 8, in <module>
    sys.exit(main())
  File "/home/swyka/miniconda3/envs/ref_creation/lib/python3.6/site-packages/clipkit/clipkit.py", line 111, in main
    execute(**process_args(args))
  File "/home/swyka/miniconda3/envs/ref_creation/lib/python3.6/site-packages/clipkit/clipkit.py", line 59, in execute
    input_file, file_format=input_file_format
  File "/home/swyka/miniconda3/envs/ref_creation/lib/python3.6/site-packages/clipkit/files.py", line 34, in get_alignment_and_format
    alignment = AlignIO.read(open(inFile), file_format.value)
AttributeError: 'str' object has no attribute 'value'
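
Note for readers who hit the same AttributeError: the traceback suggests that file_format is expected to be an enum-like object with a .value attribute, while the value given with -if fasta reached that line as a plain string. The sketch below shows one generic way to accept both forms. FileFormat and coerce_format are hypothetical names used for illustration only, not ClipKIT's actual code, and the list of formats is an assumption.

```python
from enum import Enum


class FileFormat(Enum):
    """Hypothetical enum of input formats (illustration only, not ClipKIT's class)."""
    fasta = "fasta"
    clustal = "clustal"
    phylip = "phylip"


def coerce_format(value):
    """Return a FileFormat member for either a member or its string value."""
    if isinstance(value, FileFormat):
        return value
    # Enum lookup by value; raises ValueError for an unknown format name.
    return FileFormat(value)


fmt = coerce_format("fasta")
print(fmt.value)  # safe now: fmt is an enum member, not a str
```

With a coercion step like this, code that later calls file_format.value would work whether the caller passed a string or an enum member.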

This is the head of my fasta file, for reference.

(ref_creation) swyka@taxon-pipeline:~/references/2020-09-10_updates$ head EOG09261EM7.fasta 
>XP_001588528.1|GCF_000146945.2|EOG09261EM7
MYSRYVPPSKKKVTVGDQLSQAPLSPLKSSSPPPPPSPAPAIRPDASSSYARYIPPSKSNVKPDAQLNAPKLGLESPSPASKRKREDALEPGPEPVLKKAKKDKKEKSVKVHASATLDEPHADNASSEEAQEVTKKDKKQKKKKSSDDSTSSEETTENPEDIDDKRHKKVLEKREKSIKKAERRARKAAEEGRDAEDAQEPEEPVEIHNLVPLPQPEPIPELPPPSLESTLPSWLASPILVSPTTT
AQFSEVGVEAEAATVLRSKGFNEAFAVQAAVLPLLLHGTRQKPGDVLVSAATGSGKTLSYVLPMTQDISRNIVTRLRGLIVMPTRELVSQAREVCEVCSSAFSAGSRKRVKIGTAVGNEAFKVEQANLMENTYKYDPARYHEQERRKNLRWESSDAGTDDEGEPLLDDEAISPLPDHIIEPVSKVDILICTPGRLVEHLKFTPGFTLEYVKWLVIDEADKLLDQSFQQWLNVVMSSLATGQNSFPN
NRDRVRKIVLSATMTRDIGQLTSLKLYRPKLVVLAGSSAGDDGKSSHAHILPPGLVEFAVKVDDENLKPLYLMEILKGNDMIDDSKIKSDSDTDSDTDSDSDSDDSSSDSSSDKDSSEDSSSSDDSSISGSDSESKPDKVSPKSKPKFKTNPLPISHEPHGVLIFTKSNESAIRLGRLISLIHPSYTEIIGTLTSTTRSSERKASLASFSRGKLQILVASDLVSRGLDLPDLAHVINYDVPTSITN
YVHRVGRTARAGRQGHAWTLVGNSEARWFFNEVAKSEEIRRRESAKVKRIVVDARKFGEYKKETYEEALEELGQEAMAIRSKK
>TGO71980.1|GCA_004786205.1|EOG09261EM7
MYSRYVPPSKKKVATQEKSIEAPPSPPKSSSPPPPLAPPATRPDASSTYARYIPPSKSKSKPNTQLDAPITAPESPSPASKRKHEDVVQPTPEPISKKARKEKKEKSVTALMSSVSEESHKDTISSEEAQEVTKKDKKSKKSKSKVETIPSRDTAGSNEDIDDTRHKRVLEKREKSIKKAERRARKAAEEGVAVENAQEPEEPAEIHDLVPLPQPEPIPELPPPSLESTLPSWLASPILVSPTTTA
EFSDLGVEVEAANVLRSKSFNEAFAVQAAVLPLLLPGSQQRPGDVLVSAATGSGKTLSYVLPMTQDISRNTVTRLRGLIVMPTRELVSQAREVCEVCSSAFSVGSRKRVKIGTAVGNEAFKVEQANLMENTHRYDPILYHEQEQRKNSRWESSDAGTDDEGEPFFDDEIVSPLPDHVIEPVSKVDILICTPGRLVEHLKSTPGFTLQHVKWLVIDEADKLLDQSFQQWLDIVMNSLAAGQKSLPSN
KDRVRKIVLSATMTRDIGQLTSLKLYRPKLVVLEGSSAGDDGKGSQAHVLPSGLAEFAVKVDDENLKPLYLIEILKGNNMIDESEIKSDTDTDSDSDSSSDSSSDEDSSEDSSSSDDSSESDKDSDSKPDSRPSSKSKSKIKTNPLPINHEPHGVLIFTKSNESAIRLGRLLSLINDSYTNIIGTLTSTTRSSERRASIASFSRGKLQILVASDLVSRGLDLPDLAHVINYDVPTSITNYVHRVGR
TARAGRQGSAWTLVGNTEARWFFNEVAKSEEIRRRDGEKVKRVVVDARKFGEYKKETYEDALEELGQEAMAIRKKKSS
>EPQ63759.1|GCA_000418435.1|EOG09261EM7
MNSYSYARYVPPPKSKISESIRDSTSSQPISSSKSVITCDTYHDASASYARYLPSKPISNGANSAPKFDRNSSPSSSLAKIGTRSLKPEEGPLPNIKSMNAPLSRKSPTQQGRSPSDQISSKSKRYVKTFKSDTNAQVIMDNQAYSEPSIDINKINRNKNNSNPDQRTKFNQEGLKSPSKSGSPPSGTYHQNKSEPSATIEQGKISKLDLQNVGSINVKKRARDELDSGDEKSTKNHNKILKKREK
SLKKIKAKEDASKGLPLMEPSTKVSVELHDLVPLPQAELLPPATLVSIADSYPPWMANPTHVTTQKKVSFKELGLENHTIKVLQEKGFNEAFAIQAAVLPLLISNTERERGDIVISAATGSGKTLAYTLPMIKDISYHKITKLRGLIILPTRELVFQVKEMAETCVTAFYHNKKKRVKIGTAVGNESLSVEQSHLIDYDQKYDPEEYSKRLNRIDLAWKSAEDVDDGFHSIFEESCTGKFPDHISY
QKVNVDILICTPGRLVEHMKCTSGFSIKDVSWLVIDEADKLLDQSFQQWLPLVMAEVESKHTEPLVKRRRVQKLILSATMTRDLGELAQLKLYRPKLITLDSEMEIDGDSQTHNLPCTLWESGVKVEEDGIKPLYLMEILKRESVMSIADLEDSYPSDTDSLDSSLDSETSDSSTTVSKERSTSKVLSLSPDSNSNNPQGVLIFTNSNVTAVRLGRIIPLLCPNLATRIGVLTSSLPRSSRQQCIR
SFCKGIISVIVASDLVSRGLDLPNLAHVINYDIPTSVVSYIHRVGRTARAGKEGRAWTLFTATEGRWFWNEIGRSAHIHRPKGKIVRTNISKETFSEEQRRTYGLALETLEKEATGVEICKWT
LifeMine-swyka commented 4 years ago

Sorry, this was a file error on my end. I was accidentally using the original unaligned FASTA file.
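
For anyone else who runs into "Input file could not be read": a quick sanity check is to confirm that every record in the FASTA file has the same length, since an alignment trimmer needs aligned input and, as far as I know, Biopython's AlignIO rejects FASTA records of differing lengths. Below is a minimal standard-library sketch of such a check. fasta_lengths and is_aligned are hypothetical helper names, not part of ClipKIT, and the sketch assumes record IDs are unique.

```python
import io
from typing import Dict, TextIO


def fasta_lengths(handle: TextIO) -> Dict[str, int]:
    """Return {record_id: total sequence length} for a FASTA stream."""
    lengths: Dict[str, int] = {}
    current = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the ID.
            current = (line[1:].split() or [""])[0]
            lengths[current] = 0
        elif current is not None:
            # Sequences may wrap over several lines; sum all of them.
            lengths[current] += len(line)
    return lengths


def is_aligned(lengths: Dict[str, int]) -> bool:
    """True if there is at least one record and all records share one length."""
    return bool(lengths) and len(set(lengths.values())) == 1


# Tiny in-memory examples standing in for real files.
unaligned = io.StringIO(">a\nMYSR\nYV\n>b\nMYS\n")
aligned = io.StringIO(">a\nMYSR\n>b\nM-SR\n")
print(is_aligned(fasta_lengths(unaligned)))  # False
print(is_aligned(fasta_lengths(aligned)))    # True
```

To check a real file, pass an open file handle instead of the StringIO objects. If the result is False, the file most likely still needs to be aligned before trimming.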

JLSteenwyk commented 4 years ago

No problem! Thank you for using ClipKIT. We hope you find it useful for your research.