lindenb / jvarkit

Java utilities for Bioinformatics
https://jvarkit.readthedocs.io/

Some CLPs have not been folded into the monolithic JAR #247

Closed yfarjoun closed 2 months ago

yfarjoun commented 2 months ago

Hi @lindenb

Thanks for maintaining this repo!

I wanted to note that some of the CLPs you've written over the years haven't been folded into the monolithic jvarkit CLP, and there is now scant information about how to obtain them. For example, I needed samextractclip, and since it wasn't included in jvarkit, I cloned the repository and experimented with ./gradlew until I figured out that

./gradlew samextractclip 

will create a new jar file with that CLP in it.

Is this the intended behaviour?

It would be nice if all the CLPs were included in jvarkit, or if there were instructions on how to generate the rest.

Let me know if I could help!

Thanks again.

lindenb commented 2 months ago

@yfarjoun hi Yossi, yes, I moved most of the tools into a central command line. https://github.com/lindenb/jvarkit/blob/bc1e0e833eb6c46c01f7c21acf3a0dc55b8388c8/build.gradle#L2144

But there are still some tools that need to be moved, like samextractclip: https://github.com/lindenb/jvarkit/blob/bc1e0e833eb6c46c01f7c21acf3a0dc55b8388c8/build.gradle#L1690C16-L1690C30 (I need time & motivation!)

(I will not move some of the IMHO useless tools, those requiring a large library like the MySQL driver), but I can quickly move samextractclip.

lindenb commented 2 months ago

@yfarjoun I moved extractclip to jvarkit/central. Just tell me if you urgently need any other sub-tools.

$ java -jar dist/jvarkit.jar samextractclip -h
Usage: samextractclip [options] Files
  Options:
    -c, --clipped
      Print the original Read where the clipped regions have been removed.
      Default: false
    -h, --help
      print help and exit
    --helpFormat
      What kind of help. One of [usage,markdown,xml].
    -m, --minsize
      Min size of clipped read
      Default: 5
    -p, --original
      Print Original whole Read that contained a clipped region.
      Default: false
    -o, --output
      Output file. Optional . Default: stdout
    -readFilter, --readFilter
      [20181208]A JEXL Expression that will be used to filter out some 
      sam-records (see 
      https://software.broadinstitute.org/gatk/documentation/article.php?id=1255). 
      An expression should return a boolean value (true=exclude, false=keep 
      the read). An empty expression keeps everything. The variable 'record' 
      is the current observed read, an instance of SAMRecord (https://samtools.github.io/htsjdk/javadoc/htsjdk/htsjdk/samtools/SAMRecord.html).
      Default: record.getMappingQuality()<1 || record.getDuplicateReadFlag() || record.getReadFailsVendorQualityCheckFlag() || record.isSecondaryOrSupplementary()
    -R, --reference
      For reading/writing CRAM files. Indexed fasta Reference file. This file 
      must be indexed with samtools faidx and with picard/gatk 
      CreateSequenceDictionary or samtools dict
    --version
      print version and exit
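
The default --readFilter expression above returns true for reads that should be excluded. As a rough illustration of that "true=exclude" convention, here is a minimal, self-contained Java sketch. It does not use JEXL or htsjdk: the `Read` class is a hypothetical stand-in for the few SAMRecord properties the default expression inspects, and the names `ReadFilterSketch`, `EXCLUDE`, and `keep` are invented for this example, not taken from jvarkit.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class ReadFilterSketch {
    /** Hypothetical stand-in for the SAMRecord properties used by the default filter. */
    public static final class Read {
        public final String name;
        public final int mappingQuality;
        public final boolean duplicate;
        public final boolean failsVendorQc;
        public final boolean secondaryOrSupplementary;

        public Read(String name, int mappingQuality, boolean duplicate,
                    boolean failsVendorQc, boolean secondaryOrSupplementary) {
            this.name = name;
            this.mappingQuality = mappingQuality;
            this.duplicate = duplicate;
            this.failsVendorQc = failsVendorQc;
            this.secondaryOrSupplementary = secondaryOrSupplementary;
        }
    }

    /** Mirrors the default expression: true means the read is EXCLUDED. */
    public static final Predicate<Read> EXCLUDE = r ->
            r.mappingQuality < 1
            || r.duplicate
            || r.failsVendorQc
            || r.secondaryOrSupplementary;

    /** Keeps only the reads for which the filter returns false. */
    public static List<Read> keep(List<Read> reads) {
        return reads.stream().filter(EXCLUDE.negate()).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Read> reads = Arrays.asList(
                new Read("r1", 60, false, false, false),  // passes every test
                new Read("r2", 0, false, false, false),   // mapping quality < 1
                new Read("r3", 60, true, false, false),   // flagged as duplicate
                new Read("r4", 30, false, false, true));  // secondary or supplementary
        for (Read r : keep(reads)) {
            System.out.println(r.name);  // prints only "r1"
        }
    }
}
```

In the real tool the expression is evaluated against an htsjdk SAMRecord bound to the variable 'record', so an expression passed on the command line would call the SAMRecord getters shown in the help text rather than the fields used in this sketch.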

yfarjoun commented 2 months ago

That's awesome! Thank you!

(Now I need to wait for a release and a conda release... but I'll survive 😄)