The Flux simulator does not seem to account for the effective length of transcripts during its simulation. This means that quantification tools that "correctly" adjust for effective length will be penalized for this correction, and more strongly under relative abundance measures such as TPM than under the estimated number of reads. Since the Flux simulator reports (via its .lib file) the actual set of fragment lengths present in the underlying library, it is possible to compute "true" effective lengths for each transcript, which can then be used when computing the ground-truth TPM values. It might be worth adding this as an option to piquant, or reporting the accuracy of the different tools with respect to the "true" TPM computed both ways.
Here is a gist that computes the effective lengths of transcripts given the Flux simulator's .lib file and a dataframe containing the uncorrected lengths. Let me know if you think this makes sense to include.
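To make the idea concrete, here is a minimal sketch of the calculation (not the gist itself). It assumes one common definition of effective length: the expected number of start positions a fragment can take on a transcript, averaged over the empirical fragment length distribution. The function names and the input shapes (a plain list of fragment lengths and dicts keyed by transcript ID) are hypothetical, and parsing the .lib file is left out since its column layout is not described here.

```python
from collections import Counter


def effective_lengths(fragment_lengths, transcript_lengths):
    """Effective length per transcript from an empirical fragment length sample.

    Assumed definition: sum over fragment lengths f <= L of P(f) * (L - f + 1),
    where P(f) is the observed frequency of fragment length f in the library.
    """
    counts = Counter(fragment_lengths)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no fragment lengths supplied")
    eff = {}
    for tid, length in transcript_lengths.items():
        # Only fragments that fit on the transcript contribute start positions.
        positions = sum((length - f + 1) * n for f, n in counts.items() if f <= length)
        eff[tid] = positions / total
    return eff


def tpm(fragment_counts, lengths):
    """TPM from per-transcript fragment counts and a chosen set of lengths.

    Passing uncorrected or effective lengths gives the two variants of
    "true" TPM discussed above. Transcripts with non-positive length get 0.
    """
    rates = {
        tid: (fragment_counts[tid] / lengths[tid] if lengths[tid] > 0 else 0.0)
        for tid in fragment_counts
    }
    denom = sum(rates.values())
    if denom == 0:
        return {tid: 0.0 for tid in rates}
    return {tid: 1e6 * rate / denom for tid, rate in rates.items()}


# Toy example with made-up numbers.
frag_lens = [200, 200, 300, 400]
tx_lens = {"t1": 1000, "t2": 250}
eff = effective_lengths(frag_lens, tx_lens)
# eff == {"t1": 726.0, "t2": 25.5}
```

In this toy example the short transcript shrinks far more than the long one (250 to 25.5 versus 1000 to 726), which is why the choice of length matters more for TPM than for read counts.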