While running the workflow for all organisms, I realized that the `print` statements for constraint violations produce far too much output (probably because the GTF files contain a lot of repeated genes/exons). To reduce noise in the output, I've removed those `print` statements. The rollback logic is correct.
Also added a progress message for every 10K lines processed in the `gtf.gz`. It's better to use the `logging` mechanism, as I've done here, instead of `print` statements. That way all logging can be turned on/off (or saved to a file or database) at runtime with a couple of Python statements.
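For reference, here is a rough sketch of the pattern, not the actual code in this PR. The logger name `gtf_import`, the function `count_gtf_lines`, and the log file name are all made up for illustration; only the standard library `gzip` and `logging` modules are assumed.

```python
import gzip
import logging

# Hypothetical logger name; the real module may use a different one.
logger = logging.getLogger("gtf_import")


def count_gtf_lines(path, every=10_000):
    """Read a gzipped GTF file and log a progress message every `every` lines."""
    n = 0
    with gzip.open(path, "rt") as handle:
        for n, _line in enumerate(handle, start=1):
            # The real per-line parsing/insert work would go here.
            if n % every == 0:
                logger.info("Processed %d lines of %s", n, path)
    return n


# Turn progress messages on for this run (decided by the caller at runtime):
logging.basicConfig(level=logging.INFO)
# ...or silence them without touching the parsing code:
# logger.setLevel(logging.WARNING)
# ...or also write them to a file (file name is a placeholder):
# logger.addHandler(logging.FileHandler("import.log"))
```

The point is that the parsing code only ever calls `logger.info(...)`; whether that output is shown, hidden, or redirected is decided elsewhere.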
I also noticed `or True` in a couple of conditions, probably left over from my testing. I've removed these.
Removed our old `scripts/` folder (everything is now available as a command).
None of these changes should affect your comparison between old/new bam databases.