drivenbyentropy / aptasuite

A full-featured bioinformatics software collection for the comprehensive analysis of aptamers in HT-SELEX experiments.
https://drivenbyentropy.github.io/
GNU General Public License v3.0

Export problems (Crash during clustering) #97

Open darbark opened 3 years ago

darbark commented 3 years ago

After the analysis completes, all data are exported as mapdb files, even though the required format is specified in the config file.

drivenbyentropy commented 3 years ago

Hi,

Could you please describe how you are exporting the data? I believe you might be missing a step to instruct AptaSuite to write the sequences to disk in the correct format.

See the wiki for more information.

darbark commented 3 years ago

Sorry, I was wrong. The problem is not the output file format; the program crashes during the clustering stage with the following exception:

Exception in thread "main" java.util.NoSuchElementException: Key 'Aptacluster.RandomizedRegionSize' does not map to an existing object!
        at org.apache.commons.configuration2.AbstractConfiguration.throwMissingPropertyException(AbstractConfiguration.java:1902)
        at org.apache.commons.configuration2.AbstractConfiguration.checkNonNullValue(AbstractConfiguration.java:1889)
        at org.apache.commons.configuration2.AbstractConfiguration.getInt(AbstractConfiguration.java:1252)
        at aptasuite.CLI.runAptaCluster(CLI.java:622)
        at aptasuite.CLI.<init>(CLI.java:247)
        at aptasuite.Aptasuite.main(Aptasuite.java:70)

I ran the application as:

java -jar /home/dasha/aptasuite-0.9.6-SNAPSHOT/aptasuite-0.9.6-SNAPSHOT.jar -parse -cluster -predict structure -trace -export pool,cycles,structures,clusters -config /home/dasha/testaptasuite/test.aptasuite

The config file contains the following:

Experiment.name = b2-41_ach-1000uM_test
Experiment.description = target concentration influence on Capture-SELEX (1000 uM) test
Experiment.primer5 = GCATCAGTCCACTCGTGA
Experiment.primer3 = GTAGCGACCTCTGCTAGA
Experiment.randomizedRegionSize = 62
AptaplexParser.isPerFile = true
AptaplexParser.reader = FastqReader
SelectionCycle.name = round-03_b2-41_ach-1000uM
SelectionCycle.name = round-10_b2-41_ach-1000uM
SelectionCycle.round = 3
SelectionCycle.round = 10
SelectionCycle.isControlSelection = false
SelectionCycle.isControlSelection = false
SelectionCycle.isCounterSelection = false
SelectionCycle.isCounterSelection = false
AptaplexParser.forwardFiles = /home/dasha/testaptasuite/test/Ach-1000uM-r03_S1_L001_R1_001.fastq
AptaplexParser.forwardFiles = /home/dasha/testaptasuite/test/Ach-1000uM-r10_S6_L001_R2_001.fastq
AptaplexParser.CheckReverseComplement = true
Experiment.projectPath = /home/dasha/testaptasuite
Performance.maxNumberOfCores = 5
Export.Cycles = Round10
Export.MinimalClusterSize = 3
Export.ClusterFilterCriteria = ClusterSize
Export.compress = true
Export.SequenceFormat = fastq
Export.IncludePrimerRegions = false
Export.PoolCardinalityFormat = frequencies

After the crash, the working directory looks like this:

├── clusterdata
│   └── clusters.mapdb
├── cycledata
│   ├── 10_round-10_b2-41_ach-1000uM.mapdb
│   └── 3_round-03_b2-41_ach-1000uM.mapdb
├── logs
│   └── log_2020-10-27_17-41-30.txt
├── metadata.mapdb
├── output.log
├── pooldata
│   ├── bounds_data0000.mapdb
│   ├── data0000.mapdb
│   └── inverse_data0000.mapdb
├── structuredata
│   ├── data0000.mapdb
│   └── data0001.mapdb
├── test
│   ├── Ach-1000uM-r03_S1_L001_R1_001.fastq
│   └── Ach-1000uM-r10_S6_L001_R2_001.fastq
└── test.aptasuite

6 directories, 14 files

Also, I've attached the output.log file and the log_2020-10-27_17-41-30.txt file generated by AptaSuite.

msmainy commented 2 years ago

Hi,

Did you manage to figure it out? I've done the same thing and have the same problem: it won't cluster.


Exception in thread "main" java.util.NoSuchElementException: Key 'Aptacluster.RandomizedRegionSize' does not map to an existing object!
        at org.apache.commons.configuration2.AbstractConfiguration.throwMissingPropertyException(AbstractConfiguration.java:1902)
        at org.apache.commons.configuration2.AbstractConfiguration.checkNonNullValue(AbstractConfiguration.java:1889)
        at org.apache.commons.configuration2.AbstractConfiguration.getInt(AbstractConfiguration.java:1252)
        at aptasuite.CLI.runAptaCluster(CLI.java:622)
        at aptasuite.CLI.<init>(CLI.java:247)
        at aptasuite.Aptasuite.main(Aptasuite.java:70)

drivenbyentropy commented 2 years ago

Hi,

I assume you are running this in command line mode. In this case, you might be missing the appropriate configuration file options related to AptaCluster. Specifically, you might need to specify Aptacluster.RandomizedRegionSize and Aptacluster.LSHDimension as described in the wiki.

Hope this helps.
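
For anyone hitting the same crash: the config file posted above sets Experiment.randomizedRegionSize but contains no Aptacluster.* keys, which matches the exception. As a rough illustration, the sketch below checks a config file for the two keys named in this thread before a long run is started. It is not part of AptaSuite; the class name ConfigPreflight and the method missingKeys are hypothetical, and it assumes that java.util.Properties parses the file closely enough for a key presence check (AptaSuite itself reads the file through Apache Commons Configuration, and other AptaCluster keys may also be required; see the wiki).

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper, not part of AptaSuite.
class ConfigPreflight {
    // Keys named in this thread as needed for -cluster in command line mode.
    static final List<String> REQUIRED_FOR_CLUSTER = List.of(
            "Aptacluster.RandomizedRegionSize",
            "Aptacluster.LSHDimension");

    // Returns the required keys that are absent from the given config text.
    static List<String> missingKeys(String configText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(configText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<String> missing = new ArrayList<>();
        for (String key : REQUIRED_FOR_CLUSTER) {
            // Key names are compared case-sensitively.
            if (!props.containsKey(key)) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ConfigPreflight.java <config file>");
            return;
        }
        List<String> missing = missingKeys(Files.readString(Path.of(args[0])));
        if (missing.isEmpty()) {
            System.out.println("All AptaCluster keys checked here are present.");
        } else {
            System.out.println("Missing keys: " + missing);
        }
    }
}
```

Run against the test.aptasuite file shown earlier in this thread, this sketch would be expected to report both keys as missing. Suitable values for them depend on the experiment and are described in the wiki.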