COMBINE-lab / salmon

🐟 🍣 🍱 Highly-accurate & wicked fast transcript-level quantification from RNA-seq reads using selective alignment
https://combine-lab.github.io/salmon
GNU General Public License v3.0

Salmon v0.9.0 index FastxParser error #176

Closed (bounlu closed 6 years ago)

bounlu commented 6 years ago

I get the FastxParser error shown below while indexing with Salmon v0.9.0.

What does it mean? Did the indexing run successfully? Should I worry about anything?

$ salmon index -t gencode.v27lift37.transcripts.fa -i salmon_gencode/
Version Info: This is the most recent version of Salmon.
[2017-11-28 16:51:37.147] [jLog] [info] building index
RapMap Indexer

[Step 1 of 4] : counting k-mers
[2017-11-28 16:51:38.341] [jointLog] [warning] Entry with header [ENST00000594394.1|ENSG00000268141.1|-|-|AL606500.1-201|AL606500.1|21|protein_coding|], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2017-11-28 16:51:40.503] [jointLog] [warning] Entry with header [ENST00000595481.1|ENSG00000268556.1|-|-|AC012360.1-201|AC012360.1|30|protein_coding|], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2017-11-28 16:51:47.865] [jointLog] [warning] Entry with header [ENST00000473810.1_1|ENSG00000239255.1_3|OTTHUMG00000157482.1_3|OTTHUMT00000348942.1_1|AC007620.1-201|AC007620.1|25|processed_pseudogene|], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2017-11-28 16:51:48.079] [jointLog] [warning] Entry with header [ENST00000603775.1_1|ENSG00000271544.1_3|OTTHUMG00000184300.1_3|OTTHUMT00000468575.1_1|AC006499.8-201|AC006499.8|23|processed_pseudogene|], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2017-11-28 16:52:00.610] [jointLog] [warning] Entry with header [ENST00000632684.1_1|ENSG00000282431.1_3|OTTHUMG00000190602.2_3|OTTHUMT00000485301.2_1|AC245427.8-201|AC245427.8|9|TR_D_gene|], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2017-11-28 16:52:11.269] [jointLog] [warning] Entry with header [ENST00000606517.1|ENSG00000271831.1|OTTHUMG00000185672.1|OTTHUMT00000470956.1|AL022345.9-001|AL022345.9|27|unprocessed_pseudogene|], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2017-11-28 16:52:13.209] [jointLog] [warning] Entry with header [ENST00000598940.1|ENSG00000269766.1|-|-|AL356215.1-201|AL356215.1|30|protein_coding|], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2017-11-28 16:52:16.678] [jointLog] [warning] Entry with header [ENST00000598624.1|ENSG00000269531.1|-|-|AC073569.1-201|AC073569.1|30|protein_coding|], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2017-11-28 16:52:17.351] [jointLog] [warning] Entry with header [ENST00000626826.1_1|ENSG00000281344.1_3|OTTHUMG00000189570.1_3|OTTHUMT00000479989.1_1|HELLPAR-201|HELLPAR|205012|macro_lncRNA|] was longer than 200000 nucleotides.  Are you certain that we are indexing a transcriptome and not a genome?
[2017-11-28 16:52:17.556] [jointLog] [warning] Entry with header [ENST00000543745.1_1|ENSG00000255972.1_3|OTTHUMG00000168883.1_3|OTTHUMT00000401485.1_1|AC026333.1-201|AC026333.1|28|processed_pseudogene|], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2017-11-28 16:52:18.458] [jointLog] [warning] Entry with header [ENST00000415118.1_3|ENSG00000223997.1_4|OTTHUMG00000170844.2_4|OTTHUMT00000410670.2_3|TRDD1-201|TRDD1|8|TR_D_gene|], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2017-11-28 16:52:18.458] [jointLog] [warning] Entry with header [ENST00000434970.2_3|ENSG00000237235.2_3|OTTHUMG00000170845.2_3|OTTHUMT00000410671.2_3|TRDD2-201|TRDD2|9|TR_D_gene|], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2017-11-28 16:52:18.458] [jointLog] [warning] Entry with header [ENST00000448914.1_2|ENSG00000228985.1_3|OTTHUMG00000170846.2_3|OTTHUMT00000410672.2_2|TRDD3-201|TRDD3|13|TR_D_gene|], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2017-11-28 16:52:19.821] [jointLog] [warning] Entry with header [ENST00000439842.1_1|ENSG00000236597.1_2|OTTHUMG00000152435.2_2|OTTHUMT00000326213.2_1|IGHD7-27-201|IGHD7-27|11|IG_D_gene|], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2017-11-28 16:52:19.821] [jointLog] [warning] Entry with header [ENST00000390567.1_1|ENSG00000211907.1_2|OTTHUMG00000152429.2_2|OTTHUMT00000326207.2_1|IGHD1-26-201|IGHD1-26|20|IG_D_gene|], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2017-11-28 16:52:19.821] [jointLog] [warning] Entry with header [ENST00000452198.1_1|ENSG00000225825.1_2|OTTHUMG00000152436.2_2|OTTHUMT00000326214.2_1|IGHD6-25-201|IGHD6-25|18|IG_D_gene|], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2017-11-28 16:52:19.821] [jointLog] [warning] Entry with header [ENST00000390569.1_1|ENSG00000211909.1_2|OTTHUMG00000152427.2_2|OTTHUMT00000326205.2_1|IGHD5-24-201|IGHD5-24|20|IG_D_gene|], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2017-11-28 16:52:19.821] [jointLog] [warning] Entry with header [ENST00000437320.1_2|ENSG00000227196.1_3|OTTHUMG00000152438.2_3|OTTHUMT00000326216.2_2|IGHD4-23-201|IGHD4-23|19|IG_D_gene|], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2017-11-28 16:52:19.821] [jointLog] [warning] Entry with header [ENST00000390572.1_1|ENSG00000211912.1_2|OTTHUMG00000152428.2_2|OTTHUMT00000326206.2_1|IGHD2-21-201|IGHD2-21|28|IG_D_gene|], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2017-11-28 16:52:19.822] [jointLog] [warning] Entry with header [ENST00000450276.1_1|ENSG00000237020.1_2|OTTHUMG00000152432.2_2|OTTHUMT00000326210.2_1|IGHD1-20-201|IGHD1-20|17|IG_D_gene|], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2017-11-28 16:52:19.822] [jointLog] [warning] Entry with header [ENST00000390574.1_3|ENSG00000211914.1_3|OTTHUMG00000152431.2_3|OTTHUMT00000326209.2_3|IGHD6-19-201|IGHD6-19|21|IG_D_gene|], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2017-11-28 16:52:19.822] [jointLog] [warning] Entry with header [ENST00000390575.1_1|ENSG00000211915.1_2|OTTHUMG00000152433.2_2|OTTHUMT00000326211.2_1|IGHD5-18-201|IGHD5-18|20|IG_D_gene|], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2017-11-28 16:52:19.822] [jointLog] [warning] Entry with header [ENST00000431870.1_2|ENSG00000227800.1_3|OTTHUMG00000152437.2_3|OTTHUMT00000326215.2_2|IGHD4-17-201|IGHD4-17|16|IG_D_gene|], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2017-11-28 16:52:19.822] [jointLog] [warning] Entry with header [ENST00000451044.1_3|ENSG00000227108.1_3|OTTHUMG00000152369.2_3|OTTHUMT00000326003.2_3|IGHD1-14-201|IGHD1-14|17|IG_D_gene|], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2017-11-28 16:52:19.822] [jointLog] [warning] Entry with header [ENST00000390580.1_1|ENSG00000211920.1_2|OTTHUMG00000152370.2_2|OTTHUMT00000326004.2_1|IGHD6-13-201|IGHD6-13|21|IG_D_gene|], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2017-11-28 16:52:19.822] [jointLog] [warning] Entry with header [ENST00000390581.1_1|ENSG00000211921.1_2|OTTHUMG00000152367.2_2|OTTHUMT00000326001.2_1|IGHD5-12-201|IGHD5-12|23|IG_D_gene|], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2017-11-28 16:52:19.822] [jointLog] [warning] Entry with header [ENST00000431440.2_1|ENSG00000232543.2_2|OTTHUMG00000152368.2_2|OTTHUMT00000326002.2_1|IGHD4-11-201|IGHD4-11|16|IG_D_gene|], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2017-11-28 16:52:19.822] [jointLog] [warning] Entry with header [ENST00000430425.1_1|ENSG00000237197.1_2|OTTHUMG00000152357.2_2|OTTHUMT00000325963.2_1|IGHD1-7-201|IGHD1-7|17|IG_D_gene|], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2017-11-28 16:52:19.822] [jointLog] [warning] Entry with header [ENST00000454691.1_1|ENSG00000228131.1_2|OTTHUMG00000152353.2_2|OTTHUMT00000325959.2_1|IGHD6-6-201|IGHD6-6|18|IG_D_gene|], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2017-11-28 16:52:19.822] [jointLog] [warning] Entry with header [ENST00000390588.1_1|ENSG00000211928.1_2|OTTHUMG00000152360.2_2|OTTHUMT00000325966.2_1|IGHD5-5-201|IGHD5-5|20|IG_D_gene|], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2017-11-28 16:52:19.822] [jointLog] [warning] Entry with header [ENST00000414852.1_1|ENSG00000233655.1_2|OTTHUMG00000152355.2_2|OTTHUMT00000325961.2_1|IGHD4-4-201|IGHD4-4|16|IG_D_gene|], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2017-11-28 16:52:19.823] [jointLog] [warning] Entry with header [ENST00000454908.1_1|ENSG00000236170.1_2|OTTHUMG00000152359.2_2|OTTHUMT00000325965.2_1|IGHD1-1-201|IGHD1-1|17|IG_D_gene|], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2017-11-28 16:52:19.824] [jointLog] [warning] Entry with header [ENST00000518246.1_1|ENSG00000254045.1_2|OTTHUMG00000152060.1_2|OTTHUMT00000325154.1_1|IGHVIII-22-2-201|IGHVIII-22-2|28|IG_V_pseudogene|], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2017-11-28 16:52:19.827] [jointLog] [warning] Entry with header [ENST00000604642.1_1|ENSG00000270961.1_2|OTTHUMG00000184622.2_2|OTTHUMT00000468982.2_1|IGHD5OR15-5A-201|IGHD5OR15-5A|23|IG_D_gene|], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2017-11-28 16:52:19.827] [jointLog] [warning] Entry with header [ENST00000603326.1_1|ENSG00000271317.1_2|OTTHUMG00000184621.3_2|OTTHUMT00000468981.3_1|IGHD4OR15-4A-201|IGHD4OR15-4A|19|IG_D_gene|], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2017-11-28 16:52:19.827] [jointLog] [warning] Entry with header [ENST00000605284.1_2|ENSG00000271336.1_3|OTTHUMG00000184580.2_3|OTTHUMT00000468908.2_2|IGHD1OR15-1A-201|IGHD1OR15-1A|17|IG_D_gene|], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2017-11-28 16:52:19.833] [jointLog] [warning] Entry with header [ENST00000604446.1_1|ENSG00000270824.1_2|OTTHUMG00000184624.2_2|OTTHUMT00000468984.2_1|IGHD5OR15-5B-201|IGHD5OR15-5B|23|IG_D_gene|], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2017-11-28 16:52:19.833] [jointLog] [warning] Entry with header [ENST00000603693.1_1|ENSG00000270451.1_2|OTTHUMG00000184611.3_2|OTTHUMT00000468945.3_1|IGHD4OR15-4B-201|IGHD4OR15-4B|19|IG_D_gene|], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2017-11-28 16:52:19.833] [jointLog] [warning] Entry with header [ENST00000604838.1_1|ENSG00000270185.1_2|OTTHUMG00000184585.2_2|OTTHUMT00000468915.2_1|IGHD1OR15-1B-201|IGHD1OR15-1B|17|IG_D_gene|], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2017-11-28 16:52:28.813] [jointLog] [warning] Entry with header [ENST00000579054.1_1|ENSG00000266416.1_3|OTTHUMG00000179204.1_3|OTTHUMT00000445280.1_1|AC130289.2-201|AC130289.2|28|processed_pseudogene|], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
[2017-11-28 16:52:32.075] [jointLog] [warning] Entry with header [ENST00000634174.1_1|ENSG00000282732.1_3|OTTHUMG00000191398.1_3|OTTHUMT00000487783.1_1|AC073539.7-201|AC073539.7|28|unprocessed_pseudogene|], had length less than the k-mer length of 31 (perhaps after poly-A clipping)
Elapsed time: 59.38s

[2017-11-28 16:52:36.527] [jointLog] [warning] Removed 939 transcripts that were sequence duplicates of indexed transcripts.
[2017-11-28 16:52:36.527] [jointLog] [warning] If you wish to retain duplicate transcripts, please use the `--keepDuplicates` flag
Replaced 1 non-ATCG nucleotides
Clipped poly-A tails from 1555 transcripts
Building rank-select dictionary and saving to disk done
Elapsed time: 0.0308857s
Writing sequence data to file . . . done
Elapsed time: 0.368073s
[info] Building 32-bit suffix array (length of generalized text is 300120034)
Building suffix array . . . success
saving to disk . . . done
Elapsed time: 1.2529s
done
Elapsed time: 91.367s
processed 300000000 positions
khash had 128912443 keys
saving hash to disk . . . done
Elapsed time: 10.1643s

**Encountered FastxParser destructor while parser was still marked active (or while parsing threads were still active). Be sure to call stop() before letting FastxParser leave scope!**

[2017-11-28 16:59:48.161] [jLog] [info] done building index

rob-p commented 6 years ago

Hi @bounlu ,

This warning is harmless. The parsing code was improved, but the RapMap code that is called to build the index forgets to call a particular method (the `stop()` named in the message). The destructor detects this and calls it instead. This isn't a problem and won't affect the results. However, I'll push a patch release that eliminates the warning (either today or tomorrow). Until then, you can go ahead and use this index; it should work fine.
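
For readers curious about the pattern described above, here is a minimal C++ sketch of a class whose destructor notices a missing `stop()` call, prints a warning, and performs the cleanup itself. It is an illustration only: `SketchParser`, `carefulCaller`, `forgetfulCaller`, and `destructorRescues` are hypothetical names, not the actual FastxParser or RapMap code, and the real parser also manages worker threads, which this sketch leaves out.

```cpp
#include <cassert>
#include <iostream>

// Hypothetical stand-in for a parser that must be stopped explicitly.
// This is NOT the real FastxParser; it only models the "active" flag.
class SketchParser {
public:
    // Mark the parser as active (the real parser would launch worker threads here).
    void start() { active_ = true; }

    // Mark the parser as inactive. Returns false if it was not active.
    bool stop() {
        if (!active_) return false;
        active_ = false;
        return true;
    }

    // If the caller forgot stop(), warn and clean up anyway.
    ~SketchParser() {
        if (active_) {
            std::cerr << "destructor reached while parser was still active; "
                         "calling stop() on the caller's behalf\n";
            ++destructorRescues();
            stop();
        }
        assert(!active_);  // the parser is always inactive once destroyed
    }

    // Counts how often the destructor had to do the caller's job.
    static int& destructorRescues() {
        static int count = 0;
        return count;
    }

private:
    bool active_ = false;
};

// Calls stop() explicitly, so the destructor stays silent.
void carefulCaller() {
    SketchParser parser;
    parser.start();
    parser.stop();
}

// Forgets stop(); the destructor warns and cleans up instead.
void forgetfulCaller() {
    SketchParser parser;
    parser.start();
}
```

Calling `forgetfulCaller()` prints the warning once and increments the counter, while `carefulCaller()` stays silent. In both cases the parser ends up stopped, which is consistent with the warning being cosmetic rather than a sign of a broken index.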

rob-p commented 6 years ago

This should be fixed in 0.9.1 (though again, it should not have caused any issue in practice).