Closed TransGirlCodes closed 3 years ago
That sounds right. In general, I think data processing shouldn't be done on FASTA.Record or FASTQ.Record - because these objects are not Julia native data structures, they're useful for IO, but not more.
Transcoding between BAM/FASTQ/FASTA is a bit niche, but does require working directly on Records. So a package for BioHTSFormats or something would be a good idea.
Merging #53 (1d76c18) into master (0d81f16) will increase coverage by 0.28%. The diff coverage is 100.00%.
@@ Coverage Diff @@
## master #53 +/- ##
==========================================
+ Coverage 83.60% 83.89% +0.28%
==========================================
Files 11 12 +1
Lines 616 627 +11
==========================================
+ Hits 515 526 +11
Misses 101 101
Flag | Coverage Δ | |
---|---|---|
unittests | 83.89% <100.00%> (+0.28%) | :arrow_up: |
Flags with carried forward coverage won't be shown.
Impacted Files | Coverage Δ | |
---|---|---|
src/FASTX.jl | 100.00% <100.00%> (ø) |
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0d81f16...1d76c18.
> That sounds right. In general, I think data processing shouldn't be done on FASTA.Record or FASTQ.Record - because these objects are not Julia native data structures, they're useful for IO, but not more.
Yes, I largely agree, with some exceptions for file processing tasks where parsing fully into native structures is overkill: trimming, masking, converting, etc. These are simple, and we can implement them ourselves, so users can go through the API for that limited set of common file operations and otherwise read into our native structures for serious analysis.
OK, this looks like it is working fine, except for FASTQ files with empty sequences. I believe that is because FASTA parsing rejects them. So we may have to decide whether a record in a file is allowed to have no sequence, and make sure the answer is consistent between FASTA and FASTQ.
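To make the empty-sequence question concrete, here is a minimal, self-contained sketch of a constructor-based FASTQ to FASTA conversion. It is not the code in this PR: `FastqRecord`, `FastaRecord`, the `skip_empty` keyword and this `transcode` signature are hypothetical stand-ins that use only Base Julia, and the real `FASTQ.Record` and `FASTA.Record` types are laid out differently.

```julia
# Hypothetical stand-ins for FASTQ.Record and FASTA.Record (assumption: the
# real FASTX.jl records hold raw bytes, not separate String fields).
struct FastqRecord
    identifier::String
    description::String
    sequence::String
    quality::String
end

struct FastaRecord
    identifier::String
    description::String
    sequence::String
end

# Conversion expressed as a constructor: keep header and sequence, drop qualities.
FastaRecord(r::FastqRecord) = FastaRecord(r.identifier, r.description, r.sequence)

# Write each FASTQ record as FASTA and return the number of records written.
# `skip_empty` is an illustrative policy switch: empty sequences are either
# skipped or rejected, so the behaviour is explicit rather than accidental.
function transcode(io::IO, records; skip_empty::Bool = true)
    n = 0
    for fq in records
        fa = FastaRecord(fq)
        if isempty(fa.sequence)
            skip_empty && continue
            throw(ArgumentError("record '$(fa.identifier)' has an empty sequence"))
        end
        header = isempty(fa.description) ? fa.identifier :
                 string(fa.identifier, ' ', fa.description)
        println(io, '>', header)
        println(io, fa.sequence)
        n += 1
    end
    return n
end

reads = [
    FastqRecord("read1", "sample A", "ACGT", "IIII"),
    FastqRecord("read2", "", "", ""),  # empty sequence: the edge case above
]

buf = IOBuffer()
written = transcode(buf, reads)
fasta_text = String(take!(buf))
```

With `skip_empty = false` the same input raises an `ArgumentError` instead. Either policy could work; the point is that the FASTA and FASTQ sides should agree on one.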
For issue #50
For a consistent conversion between HTS formats, we'll probably need a separate package with a more generic version of the `transcode` function I've implemented here. Possibly using `convert` and `promote`, although for this example a simple `FASTA.Record` constructor accepting a `FASTQ.Record` was enough.
Types of changes
This PR implements the following changes: (Please tick any or all of the following that are applicable)
:ballot_box_with_check: Checklist
`docs/src/`.
`[UNRELEASED]` section of the manually curated `CHANGELOG.md` file for this repository.