rki-mf1 / covsonar

A database-driven system for handling genomic sequences of SARS-CoV-2 and screening genomic profiles.
GNU General Public License v3.0

Error when extracting TSV from rki.db 2021-12-07 #82

Closed hoelzer closed 1 year ago

hoelzer commented 1 year ago

A weird error, maybe due to my covsonar installation.

python3 /scratch/hoelzerm/git/covsonar/sonar.py match --db rki.db --date 2021-12-07:2021-12-07 --tsv > out.tsv

If it is not reproducible or is already solved in a newer version, please feel free to close.

I am using

sonar.py 1.1.8

stephan-fuchs commented 1 year ago

Hi Martin,

Your notation includes an invalid date range. To filter on a single date, please use the following command:

python3 /scratch/hoelzerm/git/covsonar/sonar.py match --db rki.db --date 2021-12-07 --tsv > out.tsv

Please let me know if this solves your issue.

hoelzer commented 1 year ago

Hey, I get the same error:

python3 /scratch/hoelzerm/git/covsonar/sonar.py match --db rki.db --date 2021-12-07 --tsv > desh.tsv

Traceback (most recent call last):
  File "/scratch/hoelzerm/git/covsonar/sonar.py", line 530, in <module>
    snr.match_genomes(include_profiles=args.include, exclude_profiles=args.exclude, accessions=args.acc, lineages=args.lineage, with_sublineage=args.with_sublineage, zips=args.zip, dates=args.date, submission_dates=args.submission_date, labs=args.lab, sources=args.source, collections=args.collection, technologies=args.technology, platforms=args.platform, chemistries=args.chemistry, software=args.software, software_version=args.version, materials=args.material, min_ct=args.min_ct, max_ct=args.max_ct, ambig=args.ambig, count=args.count, frameshifts=frameshifts, tsv=args.tsv)
  File "/scratch/hoelzerm/git/covsonar/sonar.py", line 301, in match_genomes
    rows = self.db.match(include_profiles=include_profiles, exclude_profiles=exclude_profiles, accessions=accessions, lineages=lineages, with_sublineage=with_sublineage, zips=zips, dates=dates, submission_dates=submission_dates, labs=labs, sources=sources, collections=collections, technologies=technologies, platforms=platforms, chemistries=chemistries, software=software, software_version=software_version, materials=materials, min_ct=min_ct, max_ct=max_ct, ambig=ambig, count=count, frameshifts=frameshifts, debug=debug)
  File "/scratch/hoelzerm/git/covsonar/lib/sonardb.py", line 3069, in match
    rows[i]['dna_profile'] = self.filter_ambig(rows[i]['dna_profile'], self.iupac_explicit_nt_code, keep)
  File "/scratch/hoelzerm/git/covsonar/lib/sonardb.py", line 2646, in filter_ambig
    for mutation in list(filter(None, profile.split(" "))):
AttributeError: 'NoneType' object has no attribute 'split'

It does not happen when I use a date range before or after 2021-12-07 : )

Hach, edge case Martin on the road ;)

stephan-fuchs commented 1 year ago

Hello Martin,

I apologize for getting back to you so late. I wanted to follow up on the error you reported earlier. Could you please try version 1.1.7 of covSonar, which can be set up by running the following commands:

git clone -b v1.1.7 https://github.com/rki-mf1/covsonar.git
cd covsonar
sonar.py --version

Please let me know if you are still encountering the same error. Thank you for your patience and cooperation in resolving this issue.

ChaseWLW commented 1 year ago

Hello @stephan-fuchs,

I ran into the same error message while running covsonar v1.1.8. The command is as follows:

python sonar.py match --db $DB --cpus 36 --with-sublineage --lineage BA.% --tsv

$DB is gisaid_copy.db.

Trying v1.1.7 now and will update on what happens.

matthuska commented 1 year ago

I'm using the release version of covsonar 1.1.7 and get the same error, using a date range that doesn't include Martin's 2021-12-07:

Command:

$ sonar.py match --tsv --date 2021-02-01:2021-06-01 --db rki.db

Output:

  File "<redacted>/src/covsonar-1.1.7/sonar.py", line 530, in <module>
    snr.match_genomes(include_profiles=args.include, exclude_profiles=args.exclude, accessions=args.acc, lineages=args.lineage, with_sublineage=args.with_sublineage, zips=args.zip, dates=args.date, submission_dates=args.submission_date, labs=args.lab, sources=args.source, collections=args.collection, technologies=args.technology, platforms=args.platform, chemistries=args.chemistry, software=args.software, software_version=args.version, materials=args.material, min_ct=args.min_ct, max_ct=args.max_ct, ambig=args.ambig, count=args.count, frameshifts=frameshifts, tsv=args.tsv)
  File "<redacted>/src/covsonar-1.1.7/sonar.py", line 301, in match_genomes
    rows = self.db.match(include_profiles=include_profiles, exclude_profiles=exclude_profiles, accessions=accessions, lineages=lineages, with_sublineage=with_sublineage, zips=zips, dates=dates, submission_dates=submission_dates, labs=labs, sources=sources, collections=collections, technologies=technologies, platforms=platforms, chemistries=chemistries, software=software, software_version=software_version, materials=materials, min_ct=min_ct, max_ct=max_ct, ambig=ambig, count=count, frameshifts=frameshifts, debug=debug)
  File "<redacted>/src/covsonar-1.1.7/lib/sonardb.py", line 3045, in match
    rows[i]['dna_profile'] = self.filter_ambig(rows[i]['dna_profile'], self.iupac_explicit_nt_code, keep)
  File "<redacted>/src/covsonar-1.1.7/lib/sonardb.py", line 2622, in filter_ambig
    for mutation in list(filter(None, profile.split(" "))):
AttributeError: 'NoneType' object has no attribute 'split'

ashkan98 commented 1 year ago

Hi all,

I get the same error when extracting a TSV for "BA.2" with:

python3 covsonar/sonar.py match --lineage BA.2 --date 2021-12-02:2021-12-31 --db rki.db --tsv > BA2.tsv

After narrowing the start date by brute force, I found that the first sample was introduced into rki.db on 2021-12-10, and I could then generate the TSV at --date 2021-12-02:2021-12-31

matthuska commented 1 year ago

As far as I can tell, this was fixed by #113. If anyone still sees crashes with specific date ranges, please create a new issue and make sure to include the sonar version you're using and an example command.
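
For readers who hit the same traceback on an older checkout: both tracebacks in this thread end with profile.split(" ") being called on None inside filter_ambig, which suggests that at least one matched row had no stored dna_profile. The sketch below is not the change made in #113 (its contents are not shown in this thread). It only illustrates, under that assumption, how a None profile can be treated as an empty one before splitting. The helper name split_profile and the sample rows are hypothetical.

```python
from typing import List, Optional


def split_profile(profile: Optional[str]) -> List[str]:
    """Split a space-separated mutation profile into its mutations.

    Hypothetical helper, not part of covsonar: a missing profile (None)
    is treated as an empty profile instead of raising AttributeError.
    """
    if profile is None:
        return []
    # Drop empty strings that result from repeated or trailing spaces.
    return [mutation for mutation in profile.split(" ") if mutation]


# Made-up rows: one with a profile, one where the database returned NULL.
rows = [{"dna_profile": "A23403G C241T"}, {"dna_profile": None}]
for row in rows:
    print(split_profile(row["dna_profile"]))
# prints ['A23403G', 'C241T'] and then []
```

Whether an empty profile is the right interpretation of a missing one depends on how the database was populated, so a guard like this hides the symptom rather than explaining why the profile is missing.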