I am trying to use your GTF reader to truncate a very large GTF file for testing purposes. However, when I call read_gtf on the file, I get the error below. I am happy to send you the data file, but be aware that it is 391 MB; please let me know if you want it. I downloaded it from EMBL. My code is below the traceback. Ignore the docstring; it describes the finished product.
My email is davidsborsheim@lewisu.edu
TRACEBACK:
/media/akiva/Data/FinalProj3/venv/bin/python /media/akiva/Data/FinalProj3/ExternalCode/DataClean/PullLines.py
Traceback (most recent call last):
  File "/media/akiva/Data/FinalProj3/ExternalCode/DataClean/PullLines.py", line 65, in <module>
    parsegtf(inpath, outpath)
  File "/media/akiva/Data/FinalProj3/ExternalCode/DataClean/PullLines.py", line 48, in parsegtf
    gtf_in = read_gtf(inpath)
  File "/media/akiva/Data/FinalProj3/venv/lib/python3.10/site-packages/gtfparse/read_gtf.py", line 254, in read_gtf
    result_df = parse_gtf_and_expand_attributes(
  File "/media/akiva/Data/FinalProj3/venv/lib/python3.10/site-packages/gtfparse/read_gtf.py", line 189, in parse_gtf_and_expand_attributes
    df = parse_gtf(
  File "/media/akiva/Data/FinalProj3/venv/lib/python3.10/site-packages/gtfparse/read_gtf.py", line 155, in parse_gtf
    df_lazy = parse_with_polars_lazy(
  File "/media/akiva/Data/FinalProj3/venv/lib/python3.10/site-packages/gtfparse/read_gtf.py", line 118, in parse_with_polars_lazy
    df = polars.scan_csv(
TypeError: scan_csv() got an unexpected keyword argument 'sep'
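My reading of the TypeError is that the installed polars no longer accepts `sep` as a keyword to scan_csv. I believe newer polars releases use `separator` instead, but I have not confirmed which versions changed it, so treat that as an assumption. Below is a minimal sketch for checking which keyword a function accepts. The names `accepts_keyword`, `old_style`, and `new_style` are hypothetical and only there for illustration; the two stand-in functions are not the real polars signatures.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if func can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    if name in params and params[name].kind in keyword_kinds:
        return True
    # A **kwargs catch-all accepts any keyword name.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Hypothetical stand-ins for an older and a newer reader signature.
def old_style(path, sep="\t"):
    return (path, sep)


def new_style(path, separator="\t"):
    return (path, separator)
```

In the failing environment the same check could be run as `accepts_keyword(polars.scan_csv, "sep")` and `accepts_keyword(polars.scan_csv, "separator")` to see which spelling the installed polars expects.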
My Code:
from gtfparse import read_gtf

def parsegtf(inpath, outpath):
    '''
    Function to read a gtf format file and write a test file with
    either the whole file or the first 50,000 lines, whichever is
    shorter.
    '''
    # Lifted from https://pypi.org/project/gtfparse/
    # returns GTF with essential columns such as "feature", "seqname", "start", "end"
    # alongside the names of any optional keys which appeared in the attribute column
    gtf_in = read_gtf(inpath)
    breakpoint()  # temporary: inspect the parsed result while debugging
    print(gtf_in)
    return
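Since the goal in the docstring is only to keep the first 50,000 lines, the truncation itself does not need a GTF parser. Here is a minimal sketch that uses only the standard library, assuming the file is plain uncompressed text. `truncate_file` and `max_lines` are hypothetical names, and leading `#` comment lines count toward the limit.

```python
from itertools import islice


def truncate_file(inpath, outpath, max_lines=50_000):
    """Copy at most max_lines lines from inpath to outpath; return the count."""
    written = 0
    with open(inpath, "r", encoding="utf-8") as src, \
            open(outpath, "w", encoding="utf-8") as dst:
        # islice stops at max_lines or at end of file, whichever comes first,
        # so the whole file is never loaded into memory.
        for line in islice(src, max_lines):
            dst.write(line)
            written += 1
    return written
```

This only produces the smaller test file; reading that file with read_gtf would still hit the same error until the keyword problem is resolved.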