TrinityCTAT / CTAT-LR-fusion

fusion transcript detection using long reads, leveraging ctat-minimap2 and FusionInspector

CTAT-LR "rightfq" and "leftfq" args don't support lists of filepaths, but FusionInspector does #3

Open ljwoods2 opened 2 weeks ago

ljwoods2 commented 2 weeks ago

When a comma-separated list of fastq files is passed to --rightfq or --leftfq, the ensure_full_path subroutine is effectively applied only to the first file in the list (ensure_full_path doesn't support lists of files):

# From CTAT-LR-fusion/ctat-lr-fusion line 234
if ($left_fq ne "NA") {
    $left_fq = &ensure_full_path($left_fq);
}
if ($right_fq ne "NA") {
    $right_fq = &ensure_full_path($right_fq);
}

This is problematic since FusionInspector allows lists of fastq files separated by ",":

# From FusionInspector/FusionInspector line 620
if args_parsed.right_fq_filename:
    fq_filenames = args_parsed.right_fq_filename.split(",")
    for i, fq_filename in enumerate(fq_filenames):
        fq_filenames[i] = os.path.abspath(fq_filename)
        check_files_exist([fq_filename])
    args_parsed.right_fq_filename = ",".join(fq_filenames)

As a result, only the first right and the first left fastq file are converted to absolute paths. The remaining entries in each list stay relative, and a file-not-found error is raised for them later on (I can provide example logs if needed).
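To illustrate the behavior described above, here is a minimal Python sketch. It is not the actual ctat-lr-fusion code: `ensure_full_path_single` is a hypothetical stand-in that assumes the Perl helper simply prefixes the working directory to anything that isn't already absolute, and `ensure_full_paths` is one possible per-entry variant. The file names and the `/data/run1` directory are made up for the example.

```python
import posixpath


def ensure_full_path_single(path, cwd):
    # Hypothetical model of a single-path helper (an assumption, not the
    # real Perl sub): prefix cwd unless the string already looks absolute.
    return path if posixpath.isabs(path) else posixpath.join(cwd, path)


def ensure_full_paths(fq_list, cwd):
    # Per-entry variant: split on ",", resolve each entry, re-join.
    # The "NA" sentinel is passed through untouched.
    if fq_list == "NA":
        return fq_list
    return ",".join(ensure_full_path_single(p, cwd) for p in fq_list.split(","))


cwd = "/data/run1"          # made-up working directory
arg = "a_2.fq,b_2.fq"       # made-up comma-separated fastq list

# Treating the whole list as one path only fixes the first entry:
print(ensure_full_path_single(arg, cwd))  # /data/run1/a_2.fq,b_2.fq

# Resolving each entry separately fixes all of them:
print(ensure_full_paths(arg, cwd))        # /data/run1/a_2.fq,/data/run1/b_2.fq
```

In the first output, `b_2.fq` is still relative, so a downstream tool that splits on "," from a different working directory would look for it in the wrong place. Under the same assumption, passing already-absolute paths for every entry should sidestep the problem in the meantime.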

Maybe CTAT-LR does not support multiple short-read fq pairs and this was intentional behavior? I couldn't find this documented anywhere, so please let me know if I'm missing something.

brianjohnhaas commented 1 week ago

Thanks for pointing this out. You can leave this open, and I'll aim to add support for multiple fastqs in ctat-lr-fusion for the next release.

