Closed hsinnan75 closed 4 years ago
You've found a clear hole in the documentation.
The headers roughly follow the Illumina format documented here, although the fields are not strictly used in their proper sense.
From Illumina:
@<instrument>:<run number>:<flowcell ID>:<lane>:<tile>:<x-pos>:<y-pos>:<UMI> <read>:<is filtered>:<control number>:<index>
These fields map to their use in sim3C as follows:
Field | Comment |
---|---|
Instrument | Indicates this read was produced by sim3C (always SIM3C) |
Run number | The random seed used during simulation |
Flowcell ID | Conveys the type of read-pair emitted (WGS, HIC, META3C, etc.) |
Lane | Not used and always 1 |
Tile | Not used and always 1 |
x-pos | Not used and always 1 |
y-pos | A unique integer incremented as pairs are emitted during simulation |
UMI | Defined as optional and not used by sim3C (not written) |
read | Properly used to indicate whether the read is the first or second in pair (1 or 2) |
is_filtered | A flag indicating whether the read is filtered (always Y) |
control bits | Not used and always 18 |
index | Not used and always 1 |
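
The mapping in the table above can be sketched as a small parser. This is only an illustration, not part of sim3C: the `Sim3CHeader` type and `parse_header` function are hypothetical names, the example header is assembled by hand from the values in the table (the seed 1234 and pair counter 42 are made up), and it assumes the sim3C-specific string is separated from the Illumina-style part by whitespace. Because the exact layout of that trailing string is not spelled out here, the sketch keeps it as raw text.

```python
from typing import NamedTuple


class Sim3CHeader(NamedTuple):
    """Hypothetical container for the fields sim3C actually populates."""
    instrument: str   # always "SIM3C"
    seed: int         # stored in the Illumina "run number" slot
    pair_type: str    # stored in the "flowcell ID" slot, e.g. WGS or HIC
    pair_number: int  # stored in the "y-pos" slot, incremented per pair
    read: int         # 1 or 2
    extra: str        # sim3C-specific trailing string, left unparsed


def parse_header(line: str) -> Sim3CHeader:
    """Split a sim3C read header into the fields described in the table."""
    line = line.strip()
    if line.startswith("@"):
        line = line[1:]
    # Assumption: whitespace separates the two Illumina-style groups
    # and the trailing sim3C string.
    parts = line.split(None, 2)
    if len(parts) < 2:
        raise ValueError("expected two whitespace-separated header groups")
    first = parts[0].split(":")
    second = parts[1].split(":")
    # The UMI is not written, so the first group has seven fields.
    if len(first) < 7 or len(second) < 4:
        raise ValueError("unexpected number of colon-separated fields")
    instrument, run, flowcell, _lane, _tile, _xpos, ypos = first[:7]
    extra = parts[2] if len(parts) > 2 else ""
    return Sim3CHeader(instrument, int(run), flowcell, int(ypos),
                       int(second[0]), extra)


# Hand-built example header, not real sim3C output.
hdr = parse_header("@SIM3C:1234:HIC:1:1:1:42 2:Y:18:1")
print(hdr.pair_type, hdr.pair_number, hdr.read)
```

Grouping reads by `pair_type` in this way is one route to counting how many pairs of each type a simulation emitted.
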
After this Illumina-style header, sim3C includes a string which varies between WGS or 3C-style pairs.
For WGS the string defines the insert fragment which was used to create the read-pair. This includes the reference ID, the beginning and end coordinates of the fragment, and the orientation (F: forward, R: reverse).
For 3C-style pairs, the string is similar but since a fragment is the product of a ligation event, it encodes two reference regions.
Note: the control bits and index fields should really be revised to 0 and perhaps ACGT, respectively, to at least comply with what is expected.
Thanks for the explanation!
Hi, could you please explain the meaning of each field in the header of the output file? For example,
and
I ran sim3C with -m hic; however, some reads are labelled "WGS" while others are labelled "HIC". In the former case, the last character is either F or R. It is a bit confusing.
Thank you!