sjteresi / TE_Density

Python script calculating transposable element density for all genes in a genome. Publication: https://mobilednajournal.biomedcentral.com/articles/10.1186/s13100-022-00264-4
GNU General Public License v3.0

Error occurred, any help for me? #140

Closed: zhangwenda0518 closed this 1 year ago

zhangwenda0518 commented 1 year ago

python ~/biosoft/genome-feature/TE_Density/examples/general_read_density_data.py /home/zhangwenda/biosoft/genome-feature/TE_Data/filtered_input_data/Cleaned_geta.tsv filtered_input_data/revised_input_data/ *h5

Can you give me some help? Running the command above gives this error:

2023-06-01 23:32:28 localhost.localdomain main[1790082] CRITICAL Error occurred while trying to read preprocessed gene annotation file into a Pandas dataframe, please refer to the README as to what information is expected

Traceback (most recent call last):
  File "/home/zhangwenda/biosoft/genome-feature/TE_Density/examples/general_read_density_data.py", line 62, in <module>
    cleaned_genes = import_filtered_genes(args.cleaned_gene_annotation, logger)
  File "/home/zhangwenda/sysoft/anaconda3/envs/TE_Density/lib/python3.10/site-packages/te_density-2.1.1-py3.10.egg/transposon/import_filtered_genes.py", line 39, in import_filtered_genes
    raise err
  File "/home/zhangwenda/sysoft/anaconda3/envs/TE_Density/lib/python3.10/site-packages/te_density-2.1.1-py3.10.egg/transposon/import_filtered_genes.py", line 18, in import_filtered_genes
    gene_data = pd.read_csv(
  File "/home/zhangwenda/sysoft/anaconda3/envs/TE_Density/lib/python3.10/site-packages/pandas/util/_decorators.py", line 211, in wrapper
    return func(*args, **kwargs)
  File "/home/zhangwenda/sysoft/anaconda3/envs/TE_Density/lib/python3.10/site-packages/pandas/util/_decorators.py", line 331, in wrapper
    return func(*args, **kwargs)
  File "/home/zhangwenda/sysoft/anaconda3/envs/TE_Density/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 950, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "/home/zhangwenda/sysoft/anaconda3/envs/TE_Density/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 605, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/home/zhangwenda/sysoft/anaconda3/envs/TE_Density/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 1442, in __init__
    self._engine = self._make_engine(f, self.engine)
  File "/home/zhangwenda/sysoft/anaconda3/envs/TE_Density/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 1753, in _make_engine
    return mapping[engine](f, **self.options)
  File "/home/zhangwenda/sysoft/anaconda3/envs/TE_Density/lib/python3.10/site-packages/pandas/io/parsers/c_parser_wrapper.py", line 79, in __init__
    self._reader = parsers.TextReader(src, **kwds)
  File "pandas/_libs/parsers.pyx", line 547, in pandas._libs.parsers.TextReader.__cinit__
  File "pandas/_libs/parsers.pyx", line 664, in pandas._libs.parsers.TextReader._get_header
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position 0: invalid start byte

sjteresi commented 1 year ago

Hi, I'm not sure exactly what this error is, but I will try to help.

First, I would verify that you are using the correct version of Python and that the packages are installed with the correct versions. Second, that appears to be a string format error. Do you possibly have a data file in a different format? It looks like it is having trouble decoding something as utf-8. Is there a string that cannot be converted?
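One quick way to check the first point is to print the interpreter and package versions from inside the TE_Density environment. This is only a minimal sketch: `report_versions` is a hypothetical helper (not part of TE_Density), and the distribution name `te_density` is an assumption taken from the egg path in the traceback.

```python
import sys
from importlib import metadata


def report_versions(packages=("pandas", "te_density")):
    """Return the running Python version plus the installed version of each package.

    The default package names are assumptions; adjust them to your environment.
    """
    versions = {"python": ".".join(str(part) for part in sys.version_info[:3])}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in the active environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in report_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Run it with the same interpreter that runs general_read_density_data.py, so the versions reported are the ones actually in use.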

teresi commented 1 year ago

File "pandas/_libs/parsers.pyx", line 664, in pandas._libs.parsers.TextReader._get_header
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position 0: invalid start byte

Looks like Pandas is trying to read your input using the utf-8 encoding, but it's not utf-8.
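To see why decoding fails at position 0, it may help to look at the first raw bytes of the file passed as the gene annotation. The sketch below is hedged: `describe_first_bytes` is a hypothetical helper, and the two signatures it checks (HDF5 and PNG, both of which begin with byte 0x89) are only examples of binary formats, not a diagnosis of this particular file. Since the command also passes .h5 files, it could be worth confirming that the first argument really is the text TSV.

```python
import codecs

# Well-known binary signatures that start with byte 0x89 (examples only).
KNOWN_SIGNATURES = {
    b"\x89HDF\r\n\x1a\n": "HDF5",
    b"\x89PNG\r\n\x1a\n": "PNG",
}


def describe_first_bytes(path, sample_size=4096):
    """Classify a file as a known binary format, UTF-8 text, or non-UTF-8 data."""
    with open(path, "rb") as handle:
        head = handle.read(sample_size)

    for signature, label in KNOWN_SIGNATURES.items():
        if head.startswith(signature):
            return f"binary ({label} signature)"

    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        # final=False tolerates a multi-byte character cut off at the sample boundary.
        decoder.decode(head, final=False)
    except UnicodeDecodeError as err:
        return f"not UTF-8 (byte {head[err.start]:#04x} at position {err.start})"
    return "UTF-8 text"
```

If this reports a binary signature, converting the encoding will not help; the path given to the script is probably pointing at the wrong file.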

I recommend converting your input file to utf-8.

This post may have relevant information: UnicodeDecodeError when reading CSV file in Pandas

Perhaps you could use file -i to determine your encoding and then iconv to convert it?
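If the file does turn out to be text in another encoding, the conversion can also be done in Python instead of iconv. A minimal sketch, under stated assumptions: `convert_to_utf8` is a hypothetical helper, and the source encoding must be whatever `file -i` reports for your file (the `latin-1` in the commented example is only a placeholder).

```python
def convert_to_utf8(src_path, dst_path, source_encoding):
    """Re-encode a text file as UTF-8, streaming it line by line."""
    # newline="" on both handles preserves the original line endings.
    with open(src_path, "r", encoding=source_encoding, newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            dst.write(line)


# Example call (paths and encoding are placeholders, not values from this issue):
# convert_to_utf8("Cleaned_geta.tsv", "Cleaned_geta.utf8.tsv", "latin-1")
```

Writing to a new file keeps the original untouched, so the converted copy can be passed to the script and compared against the source if anything looks off.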