IndexThePlanet / Logan

Logan Unitigs and Contigs

Headers missing sample accession in SRR19579649 #13

Open apcamargo opened 1 month ago

apcamargo commented 1 month ago

Both the contigs and unitigs of the sample SRR19579649 are missing the accession in their headers.

aws s3 cp s3://logan-pub/u/SRR19579649/SRR19579649.unitigs.fa.zst - | seqkit fx2tab -n | head -n 5
_0 ka:f:4.6   L:-:385190:+
_1 ka:f:2.0
_2 ka:f:1.9
_3 ka:f:14.2   L:-:758552:+ L:-:758554:+  L:+:56269:+
_4 ka:f:4.9    L:+:1214:+

Other than the headers, everything looks fine in this sample.
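
As a possible stopgap until the files are regenerated, the accession could be prepended locally. This is a minimal sketch, assuming intact headers look like `<accession>_<id>` (inferred from the report above, not confirmed). The helper name `fix_headers` is hypothetical.

```shell
# Hypothetical helper: reads FASTA on stdin and writes FASTA on stdout,
# prepending the accession to every header that starts with "_".
# Assumption: intact headers have the form ">ACCESSION_ID ...".
fix_headers() {
  acc="$1"
  # Only header lines beginning with ">_" are rewritten; sequence lines and
  # headers that already carry an accession pass through unchanged.
  sed "s/^>_/>${acc}_/"
}
```

For example, `aws s3 cp s3://logan-pub/u/SRR19579649/SRR19579649.unitigs.fa.zst - | zstd -dc | fix_headers SRR19579649` would stream the unitigs with patched headers.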

apcamargo commented 1 month ago

Just found the same issue in ERR3638815 and ERR4407703:

aws s3 cp s3://logan-pub/u/ERR3638815/ERR3638815.unitigs.fa.zst - | seqkit fx2tab -n | head -n 5
_0 ka:f:1.8
_1 ka:f:2.0   L:+:19185:-  L:-:244:+
_2 ka:f:13.8   L:+:5688:+ L:+:31784:+  L:-:67094:- L:-:67095:-
_3 ka:f:16.8   L:-:9465:-  L:+:7:+
_4 ka:f:2.5   L:+:83969:- L:+:84123:- L:+:84720:-

aws s3 cp s3://logan-pub/u/ERR4407703/ERR4407703.unitigs.fa.zst - | seqkit fx2tab -n | head -n 5
_0 ka:f:3.5
_1 ka:f:2.0
_2 ka:f:2.0
_3 ka:f:2.1
_4 ka:f:5.1   L:+:146773:+
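
To screen more accessions for the same problem, the manual check above could be wrapped in a small function. This is a rough sketch, assuming intact names start with `<accession>_` (inferred from this thread, not confirmed). The helper name `has_accession_prefix` is hypothetical.

```shell
# Hypothetical helper: reads sequence names (one per line) on stdin, e.g. the
# output of `seqkit fx2tab -n`, and exits 0 only if every name starts with
# "<accession>_". Empty input also exits 0.
has_accession_prefix() {
  acc="$1"
  awk -v p="${acc}_" '
    index($0, p) != 1 { bad = 1 }   # name does not begin with the prefix
    END { exit (bad ? 1 : 0) }
  '
}
```

For example, `aws s3 cp s3://logan-pub/u/ERR3638815/ERR3638815.unitigs.fa.zst - | seqkit fx2tab -n | head -n 5 | has_accession_prefix ERR3638815 || echo "ERR3638815: accession missing from headers"` would flag the sample shown above.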

rchikhi commented 3 weeks ago

Hi Antonio, thanks for flagging this! Indeed, I recall this can happen in a relatively small number of accessions.

rchikhi commented 3 weeks ago

Hoping to fix it at the same time as the k-mer repeating problem.

apcamargo commented 2 weeks ago

Thanks, @rchikhi! Let me know if I can help in any way.