brentp / hts-nim

nim wrapper for htslib for parsing genomics data files
https://brentp.github.io/hts-nim/
MIT License
155 stars 26 forks source link

Trouble with newRecord "undeclared identifier" in bam #79

Open akkellogg opened 2 years ago

akkellogg commented 2 years ago

Hi, I'm having trouble initiating new records based on an old Bam file. My goal is to make modifications to certain fields (i.e mapping quality) based on criteria within a record.

Since I can't modify the existing record directly, I figured I would generate new records with the desired modifications but maintain old fields from the original record. However, I keep running into an error when trying to compile. Below is the relevant code up to where the compilation fails:

import hts

var
   oaln: Bam

proc copy_most_fields(old_read: Record, old_bam: Bam): Record =
   var header = old_bam.hdr
   var new_read = newRecord(header)

The compilation error I receive is:

./hts_nim_static_builder -s src/my_script.nim -n my_script.nimble {"--debug": false, "--nim-src-file": src/my_script.nim, "": [], "--": false, "--nimble-file": my_script.nimble, "--tag": latest, "--deps": []} docker run -v /path/to/my/script/my_script:/path/to/my/script/my_script -v /path/to/my/script/my_script:/load/ brentp/musl-hts-nim:latest /usr/local/bin/nsb -n /path/to/my/script/my_script/my_script.nimble -s /path/to/my/script/my_script/src/my_script.nim -- -d:danger -d:release @["-n", "/path/to/my/script/my_script/my_script.nimble", "-s", "/path/to/my/script/my_script/src/my_script.nim", "--", "-d:danger", "-d:release"] {"": ["-d:danger", "-d:release"], "--nim-src-file": /path/to/my/script/my_script/src/my_script.nim, "--": true, "--deps": [], "--nimble-file": /path/to/my/script/my_script/my_script.nimble} Verifying dependencies for my_script@0.1.0 Info: Dependency on hts@any version already satisfied Verifying dependencies for hts@0.3.16 Hint: used config file '/nim/config/nim.cfg' [Conf] Hint: used config file '/nim/config/config.nims' [Conf] ......................... /path/to/my/script/my_script/src/my_script.nim(10, 22) Error: undeclared identifier: 'newRecord'

I am using the hts_nim_static_builder and I have hts-nim installed, so I'm not sure if this is user error or an issue with the function

brentp commented 2 years ago

Hi, you are using newRecord it must be NewRecord. nim is usually not case-sensitive for variable names, but the first letter matters.

akkellogg commented 2 years ago

Not sure how I missed that. I kept looking back and forth at the docs and didn't process that capital N.

Thanks!

akkellogg commented 2 years ago

Following up with another issue regarding this script. I'm no longer getting an error regarding creating the new record, however, I am having trouble filling in fields I want to copy from the original read:

proc copy_most_fields(old_read: Record, old_bam: Bam): Record =
  var header = old_bam.hdr
  var new_read = NewRecord(header)
  new_read.set_qname(old_read.qname())
  new_read.SamField.SAM_SEQ = old_read.SamField.SAM_SEQ
  new_read.SamField.SAM_FLAG = old_read.SamField.SAM_FLAG
  new_read.SamField.SAM_POS = old_read.SamField.SAM_POS
  new_read.SamField.SAM_CIGAR = old_read.SamField.SAM_CIGAR
  new_read.SamField.SAM_RNAME = old_read.SamField.SAM_RNAME
  new_read.SamField.SAM_RNEXT = old_read.SamField.SAM_RNEXT
  new_read.SamField.SAM_PNEXT = old_read.SamField.SAM_PNEXT
  new_read.SamField.SAM_TLEN = old_read.SamField.SAM_TLEN
  new_read.SamField.SAM_QUAL = old_read.SamField.SAM_QUAL
  new_read.isize = old_read.isize
  return new_read

But when I try to compile I get /path/to/my/script/my_script/src/my_script.nim(11, 11) Error: type mismatch: got <Record> but expected 'SamField = enum' which would pertain to the line new_read.SamField.SAM_SEQ = old_read.SamField.SAM_SEQ I'm not sure what is the best way to set fields in the new record

brentp commented 2 years ago

Hi, this is not how those fields work. SamField is an enum, not a property of a record.

I would use: var new_read = old_read.copy() and then modify the fields that need to be modified. `SamFields are mostly used as arguments to bam.set_option.

akkellogg commented 2 years ago

However, copying every field of the full read, means I can no longer update the mapping_quality field

I've tried:

proc copy_most_fields(old_read: Record, old_bam: Bam): Record =
  var header = old_bam.hdr
  var new_read = old_read.copy()
  new_read.mapping_quality = 60
  return new_read

which returns Error: attempting to call undeclared routine: 'mapping_quality='

and I've tried:

proc copy_most_fields(old_read: Record, old_bam: Bam): Record =
  var header = old_bam.hdr
  var new_read = old_read.copy()
  new_read.mapping_quality() = 60
  return new_read

which returns Error: 'mapping_quality(new_read)' cannot be assigned to

brentp commented 2 years ago

You can assign to the mapping quality directly with: new_read.b.core.qual = 60 as I haven't exposed a setter.

akkellogg commented 2 years ago

Thank you! Looks like it is working now